nerdculture.de is one of the many independent Mastodon servers you can use to participate in the fediverse.
Be excellent to each other, live humanism, no nazis, no hate speech. Not only for nerds, but the domain is somewhat cool. ;) No bots in general. Languages: DE, EN, FR, NL, ES, IT


#LLMs

36 posts · 33 participants · 2 posts today

🔍 Large-Scale Text Analysis & Cultural Change

In their talk at the workshop “Large Language Models for the HPSS” (@tuberlin), Pierluigi Cassotti and Nina Tahmasebi presented a multi-method approach to studying cultural and societal change through large-scale text analysis.

By combining close reading with computational techniques, including but not limited to #LLMs, they demonstrated how diverse tools can be integrated to uncover shifts in language. #DigitalHumanities

"In a new joint study, researchers with OpenAI and the MIT Media Lab found that this small subset of ChatGPT users engaged in more "problematic use," defined in the paper as "indicators of addiction... including preoccupation, withdrawal symptoms, loss of control, and mood modification."

To get there, the MIT and OpenAI team surveyed thousands of ChatGPT users to glean not only how they felt about the chatbot, but also to study what kinds of "affective cues," which was defined in a joint summary of the research as "aspects of interactions that indicate empathy, affection, or support," they used when chatting with it.

Though the vast majority of people surveyed didn't engage emotionally with ChatGPT, those who used the chatbot for longer periods of time seemed to start considering it to be a "friend." The survey participants who chatted with ChatGPT the longest tended to be lonelier and get more stressed out over subtle changes in the model's behavior, too."

futurism.com/the-byte/chatgpt-

Futurism · Something Bizarre Is Happening to People Who Use ChatGPT a Lot · By Noor Al-Sibai

Happy birthday to Cognitive Design for Artificial Minds (lnkd.in/gZtzwDn3), which was released 4 years ago!

Since then, its ideas have been presented and discussed widely in the research fields of AI, Cognitive Science and Robotics, and nowadays both the possibilities and the limitations of #LLMs, #GenerativeAI and #ReinforcementLearning (already envisioned and discussed in the book) have become a common topic of research interest in the AI community and beyond.
Similarly, the evaluation of current AI systems in human-like and human-level terms has become a critical theme, related to the problem of anthropomorphic interpretation of AI output (see e.g. lnkd.in/dVi9Qf_k).
Book reviews have been published in ACM Computing Reviews (2021): lnkd.in/dWQpJdkV and in Argumenta (2023): lnkd.in/derH3VKN

I have been invited to present the content of the book at over 20 official scientific events, including international conferences and Ph.D. schools, in the US, China, Japan, Finland, Germany, Sweden, France, Brazil, Poland, Austria and, of course, Italy.

Some news I am happy to share: Routledge/Taylor & Francis contacted me a few weeks ago about a second edition! Stay tuned!

The #book is available in many webstores:
- Routledge: lnkd.in/dPrC26p
- Taylor & Francis: lnkd.in/dprVF2w
- Amazon: lnkd.in/dC8rEzPi

@academicchatter @cognition
#AI #minimalcognitivegrid #CognitiveAI #cognitivescience #cognitivesystems

Replied in thread

@anthropy I've worked on and with #AI since the 90s. One of the biggest issues I see now is that people think it's ONE THING. I treat AI / #LLMs as #SpecialNeedsStudents: some can see, others can't; some can remember, others forget; some can fold proteins, most cannot. Many people use the wrong AI for the task, then blame it when they don't get a correct answer. It is a tool. People are trying to write a thesis about Chaucer while using a cookbook to do it, then wonder why they have problems.

ICYMI: I'll be talking at the Melbourne #ML and #AI Meetup in a couple of weeks' time about the #TokenWars - the conflict over the data used to train LLMs and the fight by IP rights holders to protect their data from scrapers.

Come learn about how #LLMs are trained on huge volumes of tokens with transformers, why those tokens are becoming more economically valuable, and what you can do to protect your token treasure.

You'll never look at ChatGPT or data the same way again.

Huge thanks to @jonoxer for the recommendation, and to Lizzie Silver for the behind-the-scenes wrangling.

meetup.com/machine-learning-ai

Meetup · The Token Wars, Tue, Apr 15, 2025, 6:00 PM | The MLAI Meetup is a community for AI researchers and professionals which hosts monthly talks on exciting research.
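For readers unfamiliar with the "tokens" in the talk title, here is a minimal, purely illustrative byte-pair-encoding-style sketch of how text gets split into the token units LLMs are trained on. The merge rules and example text are invented for this sketch and are not from the talk or any real model's vocabulary.

```python
# Toy illustration of what "tokens" are: a miniature byte-pair-encoding-style
# tokenizer with hand-picked merge rules. Real LLM tokenizers learn tens of
# thousands of merges from huge corpora; this is only a sketch.

MERGES = [("t", "h"), ("th", "e"), ("i", "n"), ("in", "g")]  # invented rules

def tokenize(text: str) -> list[str]:
    tokens = list(text)                      # start from single characters
    for left, right in MERGES:               # greedily apply each merge rule
        merged = []
        i = 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)  # fuse the adjacent pair into one token
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

print(tokenize("the thing"))  # ['the', ' ', 'th', 'ing']
```

The point of the sketch: every scraped document ends up as sequences like this, which is why the raw text behind the tokens has become economically valuable.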

I entered the #ComputationalLinguistics field in 2018 by enrolling in a Bachelor's degree.

Since then, a lot has changed. Almost everything we learned about, programmed in practice, and did research on is now nearly irrelevant in our day-to-day work.

Everything is #LLMs now. Every paper, every course, every student project.

And the newly enrolled students changed, too. They're no longer language nerds, they're #AI bros.

I miss #CompLing before ChatGPT.

The contradiction certain people in the AI space (usually ones with a VC streak) sell is that, despite AI supposedly signalling an era of abundance, everything needs to be paywalled and wealth concentrated, instead of everything being made open source. This is how you uncover the wolves in this space. #ai #llms

Continued thread

"Why do language models sometimes hallucinate—that is, make up information? At a basic level, language model training incentivizes hallucination: models are always supposed to give a guess for the next word. Viewed this way, the major challenge is how to get models to not hallucinate. Models like Claude have relatively successful (though imperfect) anti-hallucination training; they will often refuse to answer a question if they don’t know the answer, rather than speculate. We wanted to understand how this works.

It turns out that, in Claude, refusal to answer is the default behavior: we find a circuit that is "on" by default and that causes the model to state that it has insufficient information to answer any given question. However, when the model is asked about something it knows well—say, the basketball player Michael Jordan—a competing feature representing "known entities" activates and inhibits this default circuit (see also this recent paper for related findings). This allows Claude to answer the question when it knows the answer. In contrast, when asked about an unknown entity ("Michael Batkin"), it declines to answer.

Sometimes, this sort of “misfire” of the “known answer” circuit happens naturally, without us intervening, resulting in a hallucination. In our paper, we show that such misfires can occur when Claude recognizes a name but doesn't know anything else about that person. In cases like this, the “known entity” feature might still activate, and then suppress the default "don't know" feature—in this case incorrectly. Once the model has decided that it needs to answer the question, it proceeds to confabulate: to generate a plausible—but unfortunately untrue—response."

anthropic.com/research/tracing
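A rough way to picture the mechanism in the quoted passage is a refusal flag that is on by default and gets switched off by a "known entity" signal; a "misfire" is the case where the signal fires for a name the model merely recognizes but has no facts about. The following is only a toy sketch with invented entities and logic, not Anthropic's actual circuit analysis of Claude (the Karpathy case is taken from the Ars Technica excerpt below).

```python
# Toy sketch of the described mechanism: refusal is the default, a "known
# entity" signal inhibits it, and a misfire (recognized name, no stored facts)
# yields a confabulated answer. Entities and logic are invented for illustration.

FACTS = {"Michael Jordan": "Michael Jordan played basketball for the Chicago Bulls."}
RECOGNIZED_NAMES = set(FACTS) | {"Andrej Karpathy"}  # name rings a bell, facts missing

def answer(entity: str) -> str:
    refusal_active = True                      # "can't answer" circuit is on by default
    if entity in RECOGNIZED_NAMES:             # "known entity" feature fires...
        refusal_active = False                 # ...and inhibits the refusal circuit
    if refusal_active:
        return "I don't have enough information to answer."
    if entity in FACTS:
        return FACTS[entity]
    # Misfire: refusal was suppressed, but no facts exist, so the model confabulates.
    return f"[plausible-sounding but made-up claim about {entity}]"

for name in ["Michael Jordan", "Michael Batkin", "Andrej Karpathy"]:
    print(name, "->", answer(name))
```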

"Anthropic's research found that artificially increasing the neurons' weights in the "known answer" feature could force Claude to confidently hallucinate information about completely made-up athletes like "Michael Batkin." That kind of result leads the researchers to suggest that "at least some" of Claude's hallucinations are related to a "misfire" of the circuit inhibiting that "can't answer" pathway—that is, situations where the "known entity" feature (or others like it) is activated even when the token isn't actually well-represented in the training data.

Unfortunately, Claude's modeling of what it knows and doesn't know isn't always particularly fine-grained or cut and dried. In another example, researchers note that asking Claude to name a paper written by AI researcher Andrej Karpathy causes the model to confabulate the plausible-sounding but completely made-up paper title "ImageNet Classification with Deep Convolutional Neural Networks." Asking the same question about Anthropic mathematician Josh Batson, on the other hand, causes Claude to respond that it "cannot confidently name a specific paper... without verifying the information.""

arstechnica.com/ai/2025/03/why

Ars Technica · Why do LLMs make stuff up? New research peers under the hood. · By Kyle Orland
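The steering experiment described in the Ars Technica excerpt can be pictured in the same toy terms: forcing the "known entity"/"known answer" signal on bypasses the default refusal even for a made-up name. Again, this is purely illustrative and not the real intervention on Claude's activations.

```python
# Toy analogue of the steering experiment: force the "known entity" signal on,
# and the default refusal never triggers, so even a made-up athlete gets a
# confident, confabulated answer. Purely illustrative.

def answer_steered(entity: str, force_known_entity: bool = False) -> str:
    known_entity_fires = entity in {"Michael Jordan"} or force_known_entity
    if not known_entity_fires:                 # refusal stays on by default
        return "I don't have enough information to answer."
    if entity == "Michael Jordan":
        return "Michael Jordan played basketball for the Chicago Bulls."
    return f"[confidently stated but made-up claim about {entity}]"

print(answer_steered("Michael Batkin"))                           # refuses
print(answer_steered("Michael Batkin", force_known_entity=True))  # confabulates
```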