Automated rubrics for evaluating LLMs in education
ELMES builds fine-grained rubrics to assess how large language models teach, not just what they know, across 330 long-tail educational scenarios.
Plain 200-word summaries of important AI and data science news, updated every few hours.
Microsoft's GitHub Copilot moves to per-token pricing, signaling a broader shift as AI companies face pressure to pass real costs to users.
OpenAI plans a ChatGPT super app, Apple revamps Siri with Gemini, and the White House eyes an equity stake in OpenAI.
OpenAI pushes ChatGPT into super app territory, Google adds AI to thrift shopping, and Apple preps a Siri overhaul with Gemini.
OpenAI is revamping ChatGPT into a super app with coding tools and AI agents to compete with Anthropic and drive paid subscriptions.
Google Search adds AI Mode, Lens, Circle to Search, virtual try-on, and resale tools to help thrift shoppers find, evaluate, and sell vintage items.
Hackathon participants report issues activating OpenAI Codex vouchers, with no clear entry point for the key, while Modal vouchers were resolved.
Learn to write, append, and save text, CSV, and JSON files in Python using built-in tools.
OpenAI introduces Lockdown Mode to limit data exposure from prompt injection attacks by disabling live browsing and other features.
Apple's WWDC 2026 will feature a major Siri revamp using Google Gemini, an AI agent app store, and new visual intelligence tools.
Former tech executive Sriram Krishnan will step down as White House AI advisor at the end of June after 18 months shaping Trump administration AI policy.
A hackathon project runs a multi-agent economy where each creature uses a different lab's small model, with the player as a financier manipulating the market.
The Ladybird browser project will no longer accept public pull requests, citing concerns that AI-generated code undermines trust and accountability.
The Trump administration is discussing taking an equity stake in OpenAI, potentially seeding a public wealth fund to share AI profits with citizens.
New research shifts the focus from behavior to internal mechanisms when assessing consciousness in animals and AI, finding that current AI is likely not conscious but leaving the door open for insects and future machines.
Gitco improves zero-shot forecast accuracy by filtering harmful patches from input context without retraining.
A new Python package runs MicroPython inside a WASM sandbox for safe code execution with memory and CPU limits.
Learn how to speed up spaCy pipelines and improve entity recognition with selective loading, batch processing, and hybrid rule-based methods.