Excited by the SOTA results, but until xAI proves its guardrails, “most powerful” could just as easily mean “most risky”—any sign of independent red-team reviews?
Jaw-dropping scores aside, I’m curious how xAI will convince skeptics that Grok 4 won’t repeat Grok 3’s harmful outputs. Are they publishing any safety data?
The numbers behind Grok 4 are wild, yet it feels like déjà vu after Grok 3’s backlash—does xAI have a concrete plan to prove its guardrails are stronger this time?
Grok 4 edging past Gemini and o3 is headline-worthy, but can xAI show the same rigor in ethics disclosures that it shows in benchmarks?
$30 a month is tempting, yet I hesitate after last round’s toxicity issues—will paying users get visibility into model updates and fixes?
The Colossus supercomputer clearly delivers power, but where’s the parallel investment in public-facing safety reports?
Performance is one thing, permission to operate is another; has xAI outlined how it’ll prevent a repeat of Grok 3’s antisemitic slip-ups?
If Grok 4 can “unlock new physics,” transparency should be table stakes—does anyone know whether xAI will open-source the evals?
Love the 128K context window, but without clearer governance it’s hard to cheer—when will xAI let outsiders look under the hood?
Grok 4 Heavy sounds like a powerhouse, but trust still seems the heaviest lift—have they shared who’s auditing these multi-agent systems?