digitalcourage.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
This instance is operated by Digitalcourage e.V. for the general public. So that we can do this sustainably, we charge a fee of 1€/month, billed annually in advance via SEPA direct debit.

Server stats: 853 active users

#text

7 posts · 6 participants · 1 post today
Continued thread

Each chapter has an introduction to a particular topic. Here are some quotes I liked from these introductions.

"Text-processing applications form a substantial part of the application space for any scripting language, if only because everyone can agree that text processing is useful. Everyone has bits of text that need to be reformatted or transformed in various ways."
– Fred L. Drake Jr, from his introduction to chapter 1, "Text", in the 2nd edition of the Python Cookbook (2005).

(3/6)

Recently I've combined various functions which I've been using in other projects (e.g. my personal PKM toolchain) and published them as a new library, thi.ng/text-analysis, for better re-use:

- customizable, composable & extensible tokenization (transducer based)
- ngram generation
- Porter-stemming & stopword removal
- vocabulary (bi-directional index) creation
- dense & sparse multi-hot vector encoding/decoding
- histograms (incl. sorted versions)
- tf-idf (term frequency & inverse document frequency), multiple strategies
- k-means clustering (with k-means++ initialization & customizable distance metrics)
- similarity/distance functions (dense & sparse versions)
- central terms extraction

The attached code example (also in the project readme) uses this package to create a clustering of all ~210 #ThingUmbrella packages, based on their assigned tags/keywords...

The library is not intended to be a full-blown NLP solution, but I keep on finding myself running into these functions/concepts quite often, and maybe you'll find them useful too...
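For readers unfamiliar with the concepts in the list above, here is a minimal sketch of three of them (tokenization, n-gram generation and tf-idf) in plain TypeScript. It does not use thi.ng/text-analysis, and it is not meant to mirror that library's API: the function names `tokenize`, `ngrams` and `tfidf` are hypothetical, the tokenizer is ASCII-only, and the tf-idf weighting shown (length-normalized term frequency times ln(N/df)) is just one of several common strategies.

```typescript
// Illustrative sketch only: plain TypeScript, NOT the thi.ng/text-analysis API.
// All function names here are hypothetical.

/** Lowercase a text and split it into ASCII alphanumeric word tokens. */
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

/** Produce all contiguous n-grams (tokens joined by a space). */
function ngrams(tokens: string[], n: number): string[] {
  const out: string[] = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    out.push(tokens.slice(i, i + n).join(" "));
  }
  return out;
}

/**
 * Compute tf-idf scores per document.
 * Assumed variant: tf = count / document length, idf = ln(N / df),
 * where N is the number of documents and df the number of documents
 * containing the term.
 */
function tfidf(docs: string[][]): Map<string, number>[] {
  // Document frequency: in how many documents does each term occur?
  const df = new Map<string, number>();
  for (const doc of docs) {
    new Set(doc).forEach((term) => {
      df.set(term, (df.get(term) ?? 0) + 1);
    });
  }
  return docs.map((doc) => {
    // Raw term counts for this document
    const counts = new Map<string, number>();
    for (const term of doc) {
      counts.set(term, (counts.get(term) ?? 0) + 1);
    }
    // Weight each term by how rare it is across the whole collection
    const scores = new Map<string, number>();
    counts.forEach((count, term) => {
      const idf = Math.log(docs.length / (df.get(term) ?? 1));
      scores.set(term, (count / doc.length) * idf);
    });
    return scores;
  });
}

// Usage: score the terms of two tiny "documents"
const docs = ["vector math library", "text analysis library"].map(tokenize);
const bigrams = ngrams(docs[0] ?? [], 2); // bigrams of the first document
const scores = tfidf(docs);
console.log(bigrams, scores);
```

With this weighting, a term that occurs in every document (here "library") scores zero, which is why tf-idf is handy for picking out the terms that distinguish one package or note from the rest.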

Replied in thread

Thanks, @annette 🙏 - it was also with, and thanks to, the #Dialog with @ElisabethK that I began mixing blog #Text and short #Erklärvideos (explainer videos) digitally. As more of a #Bücherwurm (bookworm) myself, I'd like to offer different communication channels "crossmedially" and also make #Kommentieren (commenting) and #Weiterleiten (forwarding) easier for engaged people. And I have written many thousands of pages over the past few years, so there is still plenty of #Video material… 🤭🔋🌈🌞 Here is 1 new #Blogpost, this time about the gas company #Uniper : scilogs.spektrum.de/natur-des-

Natur des Glaubens · Fossilism or Peace: The Debates over Military Service and Gas-Fired Power Plants
Dr. Michael Blume argues, also against the backdrop of security policy, for a faster expansion of solar energy & green hydrogen.

I've gathered the tools I've made into a single launcher for my "Pachi-apps" hehe. The result is a "MicroOS" (which has nothing of an OS about it, hahaha, but I like how it sounds) that contains the tools I've made and tend to use. Coming soon to my GitHub and Codeberg (I'm already migrating to the latter).

Just ran into another case of this...

Reputable online seller of <things> in Canada, medium-sized, one of maybe half a dozen options at the national level.

Go to buy <object> for several hundred dollars. Fine.

Whether I try to create an account or check out as a guest, it wants my phone number. Fine, credit card companies want that as an additional line of defence against fraud.

However... before you can continue to the next step, you have to "click to receive a verification code" that you need to enter to complete the form.

And it must be trying to send that code by SMS/text, because I'm not getting it. My phone is a landline; there's no way to text it. They don't try email or sending the code by robot-voice to the phone number, either.

How Do I Capitalism?