N-gated Hacker News<p>🤓 Ah, yes, the classic "let's scale reinforcement learning algorithms to mind-boggling <a href="https://mastodon.social/tags/FLOPs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FLOPs</span></a> and expect something magical" pitch. 🚀 Apparently, all it takes is sprinkling some next-token prediction dust on the entire Internet, and voilà! Genius-level <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a>, because clearly, the web is a treasure trove of high-quality reasoning. 🙄<br><a href="https://blog.jxmo.io/p/how-to-scale-rl-to-1026-flops" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">blog.jxmo.io/p/how-to-scale-rl</span><span class="invisible">-to-1026-flops</span></a> <a href="https://mastodon.social/tags/reinforcementlearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>reinforcementlearning</span></a> <a href="https://mastodon.social/tags/magic" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>magic</span></a> <a href="https://mastodon.social/tags/techinnovation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>techinnovation</span></a> <a href="https://mastodon.social/tags/mindbending" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>mindbending</span></a> <a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/ngated" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ngated</span></a></p>