digitalcourage.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
Diese Instanz wird betrieben von Digitalcourage e.V. für die Allgemeinheit. Damit wir das nachhaltig tun können, erheben wir einen jährlichen Vorausbeitrag von 1€/Monat per SEPA-Lastschrifteinzug.

Server stats:

828
active users

#llms

34 posts32 participants0 posts today

NANDA: عندما يتحول الذكاء الاصطناعي إلى شبكة مستقلة تتعاون ذاتيًا

تخيل عالمًا حيث لا تكون الأجهزة مجرد أدوات تتابع أوامرك، بل شركاء حقيقيون قادرون على التعلم، التعاون، واتخاذ القرارات نيابة عنك. هذا هو الطموح الذي يسعى إليه مشروع NANDA من MIT: شبكة لا مركزية من وكلاء الذكاء الاصطناعي، تعمل بلا حاجة إلى خادم مركزي، وكأن الإنترنت بأكمله أصبح […]

wasl.news/nanda-%d8%b9%d9%86%d

Ian McEwan’s National AI Service

A vision of LLMs as encountered in the 2130s after societal recovery from a near-terminal civilisational collapse. From What We Can Know pg 116:

Our students are permitted limited access to NAI. To prevent over-dependence, they must sit before an approved desktop. They also need to wait five days before they get their next shot. The kids mostly want advice on relationships, parents, music, fashion and money. They murmur their confessions and questions and get an immediate response. The Machine, as they like to call it, knows when it is being asked to write a student essay and will terminate the session. In written form, guidance can run to half a dozen single-spaced pages and is, I think, sensible and robust, though I know that others disagree. The tone is comradely. A response to an anxious question from a nineteen-year-old might begin, ‘I believe she’s trying to tell you something here and I’d say it’s time for you to be more reflective and analytical about your own behaviour. Remember the trouble you were in last year.’

NAI knows about a respondent’s life in intimate detail and its memory, of course, is long. The kids like that. They feel important, known and cared for. They are proud of an accumulating dossier that tells of their escapades, successes, disasters and growth. NAI is a friendly aunt, concerned, critical and worldly. The young make confessions to her they would not dare make to close friends or parents. Dossiers can swell by more than 200 pages a year. The kids boast to each other of admonitions as well as praise they’ve received. They enter early adult life as heroes in an epic of trivia and passion. Young newlyweds can destroy a marriage by swapping files, but many insist on it. People continue their consultations through life and seem reassured that neither the state nor commercial entities have access to the material. But confess to a crime and NAI will turn you in.

Most of us in the Humanities Department are wary of taking personal problems to a lifeless piece of software, however sophisticated. Our privileged allotment is every other day. Over in Science and Tech they have unlimited access. The scientists we know are more inclined to take their marriage or career problems to NAI. Along our corridor we tend to approach it as a research tool. I’ve made use of her during my Blundy research and received useful notes on background reading and social contexts. NAI lets me know who’s doing what in my field, who might be trespassing on my territory and who is following an interesting lead.

Rose was forthright. ‘Idiot! Don’t you dare give up. Talk to NAI.’ I resisted, then one afternoon, with little else to do, I sketched out some questions for this beloved program that some in the Philosophy Department believe has attained consciousness. Nonsense, the hard tech people have told me. Pure projection. NAI is no better than the systems of the 2030s. Lack of progress hasn’t been down to know-how. Our various forms of disaster and chaos have blocked the development of better machines and software. No gallium and germanium or even copper in the Surrey Hills! I asked NAI to go back to the two years before and the immediate period after the Second Immortal Dinner and speculate freely for me about the network of private relations around Blundy, and to suggest where I might take my investigations next. Most of what came back was familiar

Gegen ein Mittagsloch können wir euch für morgen einen Espresso-Shot empfehlen.

Am Dienstag, 23. September, 12:00–12:30 Uhr, geht’s im nächsten @CivicDataLab Espresso-Talk um die Frage:
👉 Wie lassen sich Live-Sessions mit Sprachmodellen besser dokumentieren?
Als Ausgangspunkt dient das Experiment vom #CivicDataLab Barcamp in Köln. Dort testete die Community erstmals, wie #LLMs bei Mitschrift, Strukturierung und Nachbereitung unterstützen können.

Bist du dabei?
👉 community.civic-data.de/s/will

community.civic-data.deCivic Data Cafe - community.civic-data.deEspresso Talk : Können wir Live-Sessions mit Sprachmodellen besser dokumentieren? - Hinweis: das Event findet um 12:00 Uhr statt. Wenn eine andere Uhrzeit (z.B. 10-11 Uhr) angezeigt wird, liegt das an...
Replied in thread

@ElenLeFoll Yes, the music was awesome!

It was also nice to hear from Eva Martha Eckkrammer that #DigitalHumanities has a well-etablished and innovative role within the #Romanistik, a strength to maintain into the future.

I agree that the voice of Romance Studies, with their experience of #multilingualism, #diversity, #contextualization and #comparison, is vital for the future development also of #DigitalHumanities methods, including but not limited to #LLMs.

Inside Climate News:

Is AI Throwing Climate Change Under the Bus?

Spoiler alert: Yes, AI is bad for the climate. AI’s computing power relies on massive data centers that use enormous amounts of electricity and water. The Trump administration wants that energy to come from burning fossil fuels, rather than renewable sources. Where does that leave the climate and communities caught in the crosshairs?

insideclimatenews.org/news/220

„Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training“ ist der Titel eines sehr interessanten Papers. In diesem Paper beschreiben die Forschenden, wie LLMs beispielsweise Exploits in Code einbringen können, wenn ein bestimmtes Datum eintritt. Diesen Aspekt sollten wir berücksichtigen, wenn wir unseren Code das nächste Mal durch LLMs fixen lassen.

#ai #llms #llm #cybersecurity #paper

arxiv.org/abs/2401.05566

arXiv logo
arXiv.orgSleeper Agents: Training Deceptive LLMs that Persist Through Safety TrainingHumans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

I'm all against irresponsible uses of LLMs, but users probably wouldn't turn to a natural language prompt to get a properly understandable doc if the documentation effort was decent enough upstream.

Documentation is hard and boring most of the time, but it's what makes your product usable. Bad doc is plain gatekeeping.