Devpost
the full-duplex vision-language-action model
Emoji-to-emoji diffusion: a low-dimensional proxy for text that reasons in parallel, not word-by-word. A 328M model that tops the frontier at emoji infill—and scales with compute.
A multi-agent harness CLI that bumps Devin with Opus 4.8 on SWE-Bench Pro from 50% to 70%.
Vending machines that stay profitable: a bankruptcy-gated vending sim with in-world turn cost, plus a genetic outer loop that maintains diverse operating postures under selection pressure.
Cell signaling is attention. We simulate tissue as transformer forward passes, a physics engine for biology where you drop in cancer or a virus, native on transformer silicon.
Docker infrastructure for memory and context in AI agents
A jailbreak detector that escalates compute only when unsure, beating always-on detection by 2 points of recall at 1.6x less compute on JailbreakBench.
We introduce a hedge fund run by a fleet of small LLMs that write their own trading strategies as code and ship them in milliseconds. Tap to buy in and watch it beat typical buy-and-hold in real time.
Making clinical trial recruiting easier.
Never let an AI repeat a workflow twice - let a stronger coding agent solve the problem and distill this knowledge to a 10x cheaper and 10x faster open-source model overnight.
Orchestration for coding agents. Watch it. Checkpoint it. Retry it, smarter and cheaper.
We give every student a real-time AI tutor that detects confusion during class and instantly provides personalized explanations before they fall behind.
FlashGrep replaces RAG indexes with live LLM semantic scoring. It caches results so users/agents can refine instantly, add fresh data, and search large corpora faster as compute scales.
Qssessment turns job requirements into interactive AI interviews that test how candidates ask questions and reason through ambiguity.
Verified, hands-free first aid: a frozen open model pushed to 98% with inference-time compute, never an unverified step.
Project Falcon makes long-context AI cheaper and more reliable: same hardware, half the memory budget, better answers.
The cheapest roads through space already exist. Gravity built them. We just couldn't afford to find them... until now.
An AlphaGo moment for Clash Royale
Ads are broken for AI chat. We let advertisers bid for test-time compute, not attention: relevant products buy the agent more reasoning to prove they fit your private context.
Autoreduce is distributed autoresearch for ML systems, helping agents discover both the best algorithms and the GPU scale where they work best.
xLift is a pre-training data scout that predicts which cohorts will move a model before we spend compute. Our differentiator is that we measure whether the cohort creates learnable disagreement.
Let the consumers label the data.