CEO & Co-Founder @QuotientAI ✨ formerly @GitHub @GitHubCopilot 🤖 reformed physicist 👩‍🔬 ~ opinions are my own ~
quotientai.co/post/hello-wor…
Boston, MA
Joined March 2013
Feeling like your RAG model is just... okay?
Let’s change that! 💡
Step beyond the ordinary and join us on May 7th for an exclusive webinar hosted in collaboration with @QuotientAI.
Register at the link below 👇 buff.ly/4doMlwn
Are models overfit on public benchmarks or are public benchmarks so generic and representative of the general training corpus that regular training looks like overfitting 🤔 “overfitting” implies some intent/error during training, which may not be the case.
💡 Prompt optimization reveals different dominant models for domain-specific tasks.
@Meta Llama3 performs best w/ generic prompt but @OpenAI takes the lead with more tailored prompt.
👆 This means that improving quality is an iterative process not a one-time model choice!
We benchmarked @Meta Llama3 @databricks DBRX @MistralAI Mixtral-8x22b and @OpenAI GPT4 on our Product Catalog Q&A dataset!
1⃣ Llama-3-70b matches GPT4's pace!
2⃣ Llama-3-8b and Mixtral-8x22b have almost identical performance.
3⃣ DBRX is definitely the chattiest of the bunch 😉
@kevinroose There are two separate eval challenges that often get conflated: #1 LLM providers benchmarking their models and #2 AI companies evaluating LLM systems. #1 needs the same out-of-sample yard stick to show relative improvements, #2 needs domain specificity and customization.
This is a very important point that most people don’t realize — if you have a multi-step process with an error rate at each step, those error rates compound. The solution is to either improve the reliability or catch and correct the errors on the go — ideally both.
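To put rough numbers on the compounding, here is a minimal sketch. It assumes every step fails independently, and the 95% per-step reliability, the 5-step pipeline, and the 80% catch rate are hypothetical values chosen for illustration, not figures from the post. The function names are likewise made up for this example.

```python
import math


def end_to_end_success(step_success_rates):
    """Probability that every step succeeds, assuming independent steps."""
    return math.prod(step_success_rates)


def success_with_one_retry(p, catch_rate):
    """Per-step success when a checker catches a fraction `catch_rate`
    of failures and each caught failure gets exactly one retry.
    (Hypothetical correction model, for illustration only.)"""
    return p + (1 - p) * catch_rate * p


# Hypothetical pipeline: 5 steps, each 95% reliable on its own.
baseline = end_to_end_success([0.95] * 5)
print(f"no correction:   {baseline:.3f}")  # 0.774

# Same pipeline, but 80% of failures are caught and retried once.
corrected_step = success_with_one_retry(0.95, catch_rate=0.8)
corrected = end_to_end_success([corrected_step] * 5)
print(f"with correction: {corrected:.3f}")
```

Under these assumptions, five steps that each look fine at 95% give an end-to-end success rate of only about 77%, which is why both better per-step reliability and in-flight correction matter.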
Hello world 👋 excited to announce what @JuliaANeagu and I have been working on -- @QuotientAI
Quotient enables developers to evaluate, improve and ship high quality AI products through fast, real-world, data-backed experimentation.
1/3
That’s because most implementations of LLM-as-a-judge don’t make sense for what they’re supposed to evaluate. Ideally you can always use a mix of human, heuristic, and LLM eval. There are also tasks that are too unstructured for heuristics but structured enough for good LLM-eval.
AI progress is starting to bottleneck on good data before it bottlenecks on compute.
@freddie_v4 built an interface that enabled a global team of volunteers to create the datasets behind @CohereForAI’s Aya.