Max @Max__Web
NLProc and social stratification • Text as data
Joined February 2020
Tweets: 35
Followers: 117
Following: 731
Likes: 330
Run LLMs locally in R with the rollama R package. The latest update simplifies query generation (Prompt) and adds support for Hugging Face models (@JohannesBGruber) Check it out: medium.com/@weber.aca/rol…
Lol, jfc. 1. People write a paper to critique a scale. 2. It gets 1250 cites. 3. Turns out almost every one of those used the paper as SUPPORT for the scale link.springer.com/article/10.100…
Yea, compbio folks beware: Anaconda is now trying to get academic users to pay licensing fees. See this Reddit thread too: reddit.com/r/bioinformati…
uhh, anaconda just sent a message to our HPC admins that we're in violation of their ToS and we now need to pay for a license or remove all their software from our system?
For the past few weeks, we've been working on developing a way to compare the efficiency of AI models on different tasks: we call it the Energy Star AI project! ⭐️⚡️ huggingface.co/blog/sasha/ene… Looking forward to hearing your thoughts and ideas - reach out if you want to collaborate🤗
This paper seems very interesting: say you train an LLM to play chess using only transcripts of games of players up to 1000 elo. Is it possible that the model plays better than 1000 elo? (i.e. "transcends" the training data performance?). It seems you get something from nothing, and some information theory arguments that this should be impossible were discussed in conversations I had in the past. But this paper shows this can happen: training on 1000 elo game transcripts and getting an LLM that plays at 1500! Further, the authors connect to a clean theoretical framework for why: it's ensembling weak learners, where you get "something from nothing" by averaging the independent mistakes of multiple models. The paper argues that you need enough data diversity and careful temperature sampling for the transcendence to occur. I had been thinking along the same lines but didn't think of using chess as a clean, measurable way to scientifically measure this. Fantastic work that I'll read in more depth.
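The "ensembling weak learners" intuition in the tweet above can be illustrated with a small calculation. This is only a sketch under simplifying assumptions that are not taken from the paper: each voter is right independently with the same probability `p`, and the ensemble answers by strict majority. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n voters is correct.

    Assumes (for illustration only) that each voter is correct
    independently with probability p. n must be odd to avoid ties.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    k_min = n // 2 + 1  # smallest number of correct votes that forms a majority
    # Sum the binomial probabilities of k_min, ..., n correct votes.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    # Accuracy of the ensemble as the number of weak voters grows.
    for n in (1, 3, 11, 51):
        print(n, round(majority_vote_accuracy(0.6, n), 3))
```

With `p` above 0.5 the majority becomes more accurate than any single voter as `n` grows; with `p` at exactly 0.5 it stays at 0.5, so the gain depends on the voters' mistakes being independent.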
Excited to announce the launch of #rollama, an #Rstats package that brings the power of generative LLMs to your R environment! Wrap the @ollama API for free, private, and reproducible use of genAI. Check out our (@JohannesBGruber) paper: arxiv.org/abs/2404.07654
💬1 week left to register for our Computational Social Science workshop on May 6th and 7th at Bielefeld University📢It features an amazing line-up of great speakers and we look forward to many inspiring discussions👉Registration details and programme here computational-social-science.org
@orioljbosch Good points. The first one can be resolved by using a seed with a model that can be downloaded. For example, using zephyr-7b-beta with a seed will give consistent outcomes.
🔑 Conclusions: 1. Importance of validation in text annotation with LLMs 2. Advantages of open models for data privacy 3. Reproducibility through downloadable models and predefined seeds
🧠 Key Insights: - Challenges of proprietary LLMs in social sciences - Advocacy for open models for better reproducibility & privacy - Case studies: Sentiment analysis in tweets & leisure activities in childhood essays - Performance evaluation of various open models
📢 New Preprint Alert! 🚀 🔍 "Evaluation is all you need. Prompting Generative Large Language Models for Annotation Tasks in the Social Sciences. A Primer using Open Models" by Maximilian Weber & Merle Reichardt 🔗 Read the full paper: arxiv.org/abs/2401.00284
The pace of open-source LLM innovation and research is breath-taking. I suspect that open-source will soon become unbeatable for anyone except maybe OpenAI. Here's why: - Open-source community is way bigger than any specific company - Safety lobotomy and fear of bad press will continue to impact proprietary model performance - Smaller models that are instruct / fine-tuned are performing as well as 50x bigger models - Smaller models are more efficient and cheaper than large models - Companies will leverage open-source and offer value-added services and APIs
arxiv.org/abs/2310.06825 Mistral 7B paper is up on arxiv. The authorship order is alphabetical. Please cite with author = {Mistral AI} 🙂
📡The next term of the #TaDa Speaker Series starts next week!! We have six amazing speakers on all things #TextAsData/#NLP lined up for you 👇
How Can Open Source LLMs catch up to GPT-4V and Google's Gemini? Open-source LLMs are getting really good. However, they are not as powerful as GPT-4 yet. Plus, multimodal models like GPT-4V and Google's Gemini will be dropping soon... making it even harder for open-source models to catch up to these closed APIs. The trillion dollar question is, can open-source models close the gap to closed-source models? The answer is a cautiously optimistic "Probably!" Here is how I see the open-source ecosystem evolving and catching up to SOTA closed-source models over the next 12-18 months. Training multimodal open-source models - We at Abacus are working on open-source multimodal models and I am hopeful that all other open-source research labs are doing the same thing. The good news is there is a clear path to building these and you can start with a pre-trained model like Llama-2. A common approach is to start with a pre-trained language model and then fine-tune it on multimodal tasks using a dataset that contains multiple types of data such as text, images, and/or audio. During this fine-tuning process, the model learns to relate information across different modalities, improving its performance on the target multimodal tasks. For instance, you might fine-tune a pre-trained language model using a dataset of images and associated captions to create a model that can generate descriptive text for new images it encounters. For example, Google used this technique to extend the initial PaLM language model: PaLM was enriched with sensor data from robots, which turned it into a multimodal model, PaLM-E, capable of handling diverse tasks across robotics, visual, and language domains. Mimicking mixture of experts - Rumors suggest that GPT-4 may utilize a Mixture of Experts (MoE) architecture, where multiple smaller models, each specialized in different tasks, collaborate to process data.
This setup allows the handling of a vast number of parameters more efficiently by distributing them across these "experts". It's speculated that such an architecture could help GPT-4 manage a more diverse range of tasks and data, scaling up its capacity and capability while controlling computational and memory demands. Mimicking this structure with open-source models is not hard. You can always instruct-tune open-source models to be very good at a particular task and then use multiple models, each instruct-tuned for a specific task, to "collaborate" to answer queries. Leaderboards and benchmarks - We already have a bustling open-source community with a number of LLM benchmarks, including MT-bench, where you can easily measure how your LLM compares to others. Open-source labs and developers are engaged in a constant race to beat SOTA open-source models. The community has already caught up to GPT-3.5 and GPT-4 when fine-tuned for a particular task. The benchmarks are only going to get more robust as more and more open-source models are dropped. New, more powerful open-source models - A number of companies, including Meta, have committed to training next generation multimodal LLMs and open-sourcing them. Smarter AI alignment and RLHF - Open-source models don't have as much scrutiny as big tech. This means that they don't need as much safety lobotomy as big tech closed-source APIs. The safety lobotomy has a harmful side effect of killing legitimate queries. For example, Llama-2 refused to answer the query "how to kill a linux process" citing 'safety reasons'. Mistral-7B, however, answered that question correctly. All this means that eventually open-source will likely catch up to closed-source models. Over time, more efficient smaller models will match the performance of larger models, reducing the need to have thousands of GPUs. Already the Mistral-7B model beats the 13B Llama-2. We will continue to see such improvements on an ongoing basis.
At some point the law of diminishing returns will kick in for Google and OpenAI, unless there is a significant breakthrough in NNs and AI tech - i.e. just throwing more compute or data at the problem won't necessarily dramatically improve performance. This means that while open-source models may still be a year away from GPT-4, they will start closing the gap quickly!
LOL I guess this is happening
Researcher: Uses #crowdsourcing to evaluate LM output. Crowdworker: Uses #ChatGPT to produce labels. Researcher: Spends hours using #ChatGPT to figure out if crowdworker was cheating. Quo vadis, #NLProc ?
📢The Call for Participation for this year's summer school on the computational social science 💻of the democratic debate 🗨️🗯️🗳️is live now! Find it here: bigssscss.janlo.de (Deadline Apr 27). Note: A few more projects may appear in the coming week.
Alsuxu @Alsuxu126
20 Followers 986 Following
Ada Wan @adawan919
270 Followers 2K Following Transdisciplinarian (stats, datasci, ml, lang/socSci, tech, art, science, philosophy). (Use-inspired) fundamental research. Opinions my own. Accidental activist.
Olga Zagovora @alenyshkaxx
249 Followers 352 Following Data Scientist/Postdoc in Computational Social Science @RPTU , former @gesis_org PhD student
Doan Nam Long Vu @doannamlongvu
13 Followers 30 Following PhD Student at the MAIN associated with @UKPLab, @CS_TUDarmstadt @TUDarmstadt, Germany Working on AI and NLP for mental health
Stephan Poppe @poppe_stephan
999 Followers 2K Following Moved emotionally by Statistics - 24 HR. Statistical Services: Lecturer of Statistics @UniLeipzig, Crowd Counter @durchgezaehlt, Former Theoretical Physicist
Giuseppe (Peppe) Russ... @russogiusep
464 Followers 555 Following Research Scientist @google working on AI-Safety, Synthetic Environments. Prev @EPFL and @Stanford
Mouli Maity @MouliMaity4
25 Followers 722 Following
Eva Maria Vecchi @emvecchi
184 Followers 404 Following NLP Researcher @ims_stuttgart Computational Argumentation, Bias, Meaning, e-Deliberation, Cognitive Modeling, #NLProc methodology
@damiantrilling@akade... @damian0604
2K Followers 1K Following Professor of Journalism Studies, VU Amsterdam
Expected Parrot (YC F... @ExpectedParrot
2K Followers 1K Following Launch surveys and interviews with AI and humans, all in one place.
Bradley J. Baker @BradleyJBaker
846 Followers 1K Following Associate Professor, Temple University, Sport Management. Consumer Behavior, Social Media, Analytics, Esports, Machine Learning, and Meta-Science. (he/him)
Lauren Leek @leek_lauren
718 Followers 1K Following Politics, Data & Political Economy - visiting @LSEnews & PhDing @EUI_EU @eui_sps, former PhD trainee @ecb
Verena Kunz @anerevznuk
910 Followers 695 Following PhD candidate @GESSuniMannheim & Scientific coordinator @gesistraining. She/her. Comparative politics, legislative behaviour, computational social science.
Christian Pipal @christianpipal
958 Followers 1K Following Researcher at UZH. interested in how politicians and influencers talk (and dance) about politics. PhD from @HotPoliticsLab. prev Wien & Tartu. immer Rapid Wien.
Lena Masch @LenaMasch
1K Followers 2K Following political scientist | political psychology | emotions | images & text | she/her #firstgen 🐈‍⬛🐈‍⬛ lady 📷: Tobias Koch
Fabio Votta📊🐧 @favstats
5K Followers 5K Following Computational Communication Scientist #rstats | Postdoc @UvA_ASCoR | Blog: https://t.co/tUQ2VstbXu | Running on plant-based fuel 🌱 | he/him | 🇮🇹 🇩🇪 in 🇳🇱
Marcus Torres @MarcusTorres_
841 Followers 2K Following PolSci Ph.D. student @politicaufpe| Researcher at @ipeaonline
Hartmut Esser @EsserHartmut
3K Followers 426 Following Prof (em) for Sociology and Philosophy of Science (University of Mannheim). Main interest: Unified-Explanative-Analytic-Empirical (Social) Sciences.
Benjamin Arold @BenjaminArold
2K Followers 1K Following Assistant Professor, Faculty of Economics, @cambridge_uni. Before @ETH, @LMU_Muenchen, @ifo_Institut. Economics of Labor, Education, Religion, and AI
School of Social Scie... @tcd_school_ssp
773 Followers 995 Following The School of Social Sciences and Philosophy has suspended use of X. For news and updates, please visit our website or follow us on Bluesky and LinkedIn.
Dorian Tsolak @doriantsolak
151 Followers 277 Following PhD candidate in Sociology @ Bielefeld University Research in Computational Sociology | Comp. Social Science about Stereotypes | Racism | (female) Migration
Stefanie Walter @SteWalter
1K Followers 1K Following You can find me here: @stefaniewalter.bsky.social | Prof & Emmy Noether Fellow @TU_Muenchen | #citizens #diversity #EU #climatechange #news #textanalysis |
Oriol Bosch, PhD @orioljbosch
1K Followers 1K Following Senior Research Scientist @Prolific | PhD from @LSEMethodology | Previously: @UniofOxford @NuffieldCollege @UPFBarcelona. Views my own.
Mike Burnham @ML_Burn
951 Followers 652 Following Assistant professor @tamupols, Ph.D. @psupolisci & @CSoDA_PSU, postdoc @Princeton_CSDP. Text analysis & deep learning, methods, American politics.
Nicolai Berk @BerkNicolai
234 Followers 306 Following legacy account. now tweeting under @nicolaiberk
CLIC @clic_iwg
1K Followers 780 Following Comparative Life Course & Inequality Research Centre at @EUI_sps Prof: Herman van de Werfhorst & @JuhoHarkonen. Page admin: @GaiaGhirardi @trackjannik
Donya Rooein @donyarooein
257 Followers 936 Following Postdoc @MilaNLProc @Unibocconi. Work in Conversational AI, Large Language Models and Education
The COMPTEXT Associat... @COMPTEXTCONF
580 Followers 561 Following The COMPTEXT Association is an international community and forum for text/image/video-as-data scholars.
Hirotaka Fujibayashi @hirofujibayashi
131 Followers 414 Following Postdoc/MW Fellow @EUI_EU | PhD in IR/PS @GVAGrad | Studying (comparative) politics of immigration and refugee reception
Dirk Wulff @dirkuwulff
1K Followers 1K Following Senior scientist @mpib_berlin and @unibasel_en | language models, decision science, and sustainability (https://t.co/G46Ym2eo37).
Germans Savcisens @Ne... @germansave
260 Followers 445 Following Postdoc @NUnetsi (@KhouryCollege) 👾 ✨ work on epistemic stability of LLMs🌿 my plants call me daddy 🦄 he/him 🇱🇻🇺🇦 https://t.co/GSjxm4ymmV
Jihed Ncib @JihedNcib
1K Followers 3K Following Machine Learning Researcher (Post-doctoral) at the Connected Politics Lab and the School of Computer Science (University College Dublin).
Dag Asheim @DagAsheim
126 Followers 5K Following
Carlos J. Gil @KarlosJ89
1K Followers 2K Following Assistant Professor @UNI_FIRENZE • PhD @EUI_EU • Inequality
Roujman Shahbazian @roujman
287 Followers 399 Following Sociologist @uppsalauni, @SOFI_su_se, & @LMU_Muenchen. Social inequality and mobility. Persimmon fan.
Philine Widmer @phinifa
921 Followers 672 Following Researching topics around the media, AI, and politics.
Soziologisches Instit... @sociology_uzh
921 Followers 1K Following Department of Sociology @UZH_en🇨🇭| Soziologisches Institut der @UZH_CH (#SUZ) | 🔎#socialnorms & #cooperation, #lifecourse & #generations, #economicsociology
Younghyun Lee @younghyunlee52
151 Followers 321 Following PhD Candidate @IllinoisPolSci | immigrant integration, identity politics, political psychology, European politics | she/her/hers https://t.co/AnVZpoP3aY
𝐷𝑖𝑛𝑜.𝐶... @JustaNormalDino
786 Followers 364 Following Postdoc at ETHz - Ass. Editor @HSScomms Exploring Opinion dynamics and Collective intelligence with social simulations and belief networks.
Clemens Kroneberg @c_kroneberg
909 Followers 272 Following Sociology | Diversity, Crime, Networks, Action Theory | Professor @ISS_UniCologne @ECON_tribute | Principal Investigator @socialbond_ERC | Views: own
Eleonora Vlach @EleonoraVlach
281 Followers 155 Following Post-doctoral researcher! Team: Social Stratification and Social Policy, @goetheuni Frankfurt (DE). Deputy Editor at @ESR_news
Martina Schories @MSchories
826 Followers 589 Following research in internet histories @RuhrUniBochum, data & design @sfb1472 @UniSiegen, former data journalist @SZ @[email protected]
Benita Combet || @be... @benitacombet
900 Followers 1K Following Educational inequality & gender inequality in the labour market. Panel data & experiments. Working at @unibern. she/her. 🏳️‍🌈
Laura Eberlein (laura... @laura_eberlein
107 Followers 286 Following PhD Candidate @VUamsterdam | interested in labour market inequalities and school to work transitions
Tilman Beck @devnull90
368 Followers 1K Following Clinical Machine Learning at University Hospital Zurich / He, Him
OpenEuroLLM @OpenEuroLLM
385 Followers 5 Following A series of foundation models for transparent AI in Europe
Ada Wan @adawan919
270 Followers 2K Following Transdisciplinarian (stats, datasci, ml, lang/socSci, tech, art, science, philosophy). (Use-inspired) fundamental research. Opinions my own. Accidental activist.
Brett Adcock @adcock_brett
531K Followers 21 Following @figure_robot (AI robots) @hark_labs (personal AGI) @cover_thz (weapon detection) @flyArcher (flying cars)
Political Methodology @PolMethSociety
3K Followers 622 Following Official twitter for the Society for Political Methodology
Adam.GPT @TheRealAdamG
36K Followers 5K Following Alleged vague-poster. Enterprise token slinger and other GTM things at @OpenAI. A fan of NY sports, tech, memes & nice people. My opinions are my own.
Robert Vief @robertvi... @RobertVief
1K Followers 1K Following As this platform is owned by a right-wing extremist, I am no longer active over here. BUT, see you there: https://t.co/10wnzoF0xo
Diego @DVM2895
14 Followers 389 Following PhD candidate @EUI_ECO | Previously @ecb, @UvA_Amsterdam, @uc3m
Taegyoon Kim @TaegyoonK
726 Followers 781 Following AP at School of Digital Humanities and Computational Social Sciences & Graduate School of Data Science, KAIST | Previously @KelloggCSSI @CSoDA_PSU @psupolisci
Nhat An Trinh @NhatAnTrinh
811 Followers 589 Following Sociologist @INETOxford @DSPI_Oxford @NuffieldCollege | research on inequality, social mobility, class, wealth | she/her ➡️@natrinh.bsky.social
Marcel Fratzscher @MFratzscher
70K Followers 933 Following President of @DIW_Berlin, Professor at Humboldt University, and columnist Die Zeit. My new book: „Nach uns die Zukunft - Ein neuer Generationenvertrag…“
Taha Yasseri @TahaYasseri
7K Followers 456 Following Workday Full Prof & Chair of Technology & Society @tcddublin & @WeAreTUDublin. Director of the Centre for Sociology of Humans & Machines @SOHAM_Centre.
Dan Hendrycks @hendrycks
45K Followers 116 Following
AI at Meta @AIatMeta
806K Followers 323 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Jack Morris @jxmnop
51K Followers 1K Following research // language models, information theory, science of AI // formerly @cornell
Ilir Aliu @IlirAliu_
52K Followers 749 Following If it matters in European AI and Robotics, you'll see it here first
Berkeley AI Research @berkeley_ai
273K Followers 458 Following We're graduate students, postdocs, faculty and scientists at the cutting edge of artificial intelligence research.
Protzko @protzko.bsky... @JProtzko
1K Followers 223 Following @protzko.bsky.soci cognitive development, social cognition, & metascience. Using Twitter to help get high quality research to the world.
FReDA Panel | @FReDA@... @FredaPanel
696 Followers 278 Following #FReDA - Das familiendemografische Panel // The German Family Demography Panel Study Imprint: https://t.co/jxIK8NzDGH @bib_bund @gesis_org
Tilo Jung @TiloJung
285K Followers 2K Following Belangloser Videojournalist & Seltsamfrager | Creator of @JungNaiv | 💡🌍
Philipp Koch @philippmkoch
853 Followers 678 Following Economist | Head of Data Science @Eco_Austria | Affiliate Member and PhD from @LearningCCL @UT1Capitole | Lecturer @WU_Econ
Stephan Poppe @poppe_stephan
999 Followers 2K Following Moved emotionally by Statistics - 24 HR. Statistical Services: Lecturer of Statistics @UniLeipzig, Crowd Counter @durchgezaehlt, Former Theoretical Physicist
CEPR @cepr_org
68K Followers 106 Following Centre for Economic Policy Research, founded 1983. Network of over 1900 economists. See @voxeu https://t.co/mncEmo0sds https://t.co/DCTfP3XFFa…
Universität Mainz @uni_mainz
14K Followers 803 Following Offizieller Account der #UniMainz. Wir sind hier nicht mehr aktiv. Andere Kanäle & Impressum unter https://t.co/WWvXFDJYQz
Matt Shumer @mattshumer_
368K Followers 2K Following Investor in @GroqInc @Etched @Rork @DaytonaIO @OpenRouter + more. Prev: CEO @HyperWriteAI, @OthersideAI Try: https://t.co/gupFzizHCd Press: [email protected]
Pablo Jost - deactiva... @pbjost
2K Followers 716 Following Communication scholar @ifp_mainz | visiting professor @HannoverIJK | (social) media, journalism, politics | @pjost.bsky.social
Mark Schieritz @schieritz
21K Followers 1K Following
Carolin Amlinger @CAmlinger
7K Followers 1K Following Nicht mehr aktiv. Account bleibt für Recherchen bestehen.
xAI @xai
2.0M Followers 5 Following
Gary Marcus @GaryMarcus
227K Followers 7K Following OG GenAI Skeptic; spoke at US Senate. Warned about hallucinations in 2001. Advocating world models & neurosymbolic AI ever since. Author, Marcus on AI & 6 books
Alejandra Caraballo i... @Esqueer_
141K Followers 2K Following No longer on this hellsite. Catch me on bluesky.
YouGov @YouGov
273K Followers 454 Following YouGov gathers real opinions and behaviours from worldwide panel members to power insights into what the world thinks. Join now: https://t.co/vWG7kjsRtR
Oskar van der Wal @oskarvanderwal
315 Followers 403 Following Technology specialist at EU AI Office / AI Safety / Prev: @AmsterdamNLP @AiEleuther Thoughts & opinions are my own and do not necessarily represent my employer
philipp lorenz-spreen @lorenz_spreen
1K Followers 840 Following I am heading the group “Computational Social Science” within the Center Synergy of Systems and Scalable Data Analytics and Artificial Intelligence at TU Dresden
Mario Giulianelli @glnmario
1K Followers 1K Following Associate Prof @ucl - Member of @ELLISforEurope | Language and AI Science | Prev. senior research scientist @AISafetyInst, postdoc @ETH_en, PhD @illc_amsterdam
Kamala Harris @KamalaHarris
21.0M Followers 701 Following Always fighting for the people. Wife, Momala, Auntie. She/her. 107 Days available now.
Giuseppe (Peppe) Russ... @russogiusep
464 Followers 555 Following Research Scientist @google working on AI-Safety, Synthetic Environments. Prev @EPFL and @Stanford
Friedolin Merhout @fmerhout
941 Followers 1K Following Sociologist & Social Data Scientist @uni_copenhagen. @DukeSociology PhD. He/him/his. Under clearer skies: https://t.co/rDRivXH7XN
Timo Sprang @timo_sprang
47 Followers 116 Following Doctoral Researcher @goetheuni, Institute of Political Science.
Tiago Ventura @_Tiagoventura
5K Followers 1K Following Assistant Professor at @McCourtSchool @Georgetown Working on computational social science, social media, and politics. De Belém 🇧🇷
Vinay Hiremath @vhmth
46K Followers 106 Following currently: vibing with drones, previously: co-founder @loom, mechatronics intern @specter
Kevin Saukel @SaukelKevin
94 Followers 147 Following Irgendwie in Wissenschaft, Zivilgesellschaft und Wirtschaft tätig | #DigitalChangeMaker 2020/2021 @HFDdigital | @visionale_he
ProLOEWE @ProLOEWE
920 Followers 548 Following ProLOEWE. Netzwerk der LOEWE-Forschungsvorhaben LOEWE Research Initiatives Network