
The package website is at https://juba.github.io/rainette/.

Rainette is an R package which implements a variant of the Reinert textual clustering method. This method is available in other software such as Iramuteq (free software) or Alceste (commercial, closed source).

Features

  • Simple or double clustering algorithms
  • Plot functions and shiny gadgets to visualise and explore clustering results
  • Utility functions to split a corpus into segments or import a corpus in Iramuteq format

Installation and usage


The package can be installed from CRAN.
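A minimal sketch of this step, assuming the package is published on CRAN under the name rainette:

```r
# Install the released version from CRAN
# (assumption: the CRAN package name is "rainette")
install.packages("rainette")
```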

The development version can be installed from GitHub.
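A minimal sketch of this step, assuming the repository is juba/rainette (inferred from the website URL, not stated in the text) and that the remotes package is used for the installation:

```r
# Assumption: the remotes package is the installer of choice
install.packages("remotes")

# Assumption: the development repository is "juba/rainette" on GitHub
remotes::install_github("juba/rainette")
```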

Let’s start with an example corpus provided by the excellent quanteda package.
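A minimal sketch, assuming the example corpus is quanteda's built-in data_corpus_inaugural. The text does not name the corpus, so this choice is an assumption:

```r
library(quanteda)

# Assumption: use the US presidential inaugural addresses shipped with quanteda
corpus <- data_corpus_inaugural

# Quick look at the corpus size and the first few documents
ndoc(corpus)
summary(corpus, n = 5)
```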

First, we’ll use split_segments to split each text in the corpus into segments of about 40 words (punctuation is taken into account).
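The splitting step might look like the sketch below. The corpus (data_corpus_inaugural) is an assumption, and segment_size = 40 mirrors the "about 40 words" mentioned in the text:

```r
library(quanteda)
library(rainette)

# Assumption: data_corpus_inaugural stands in for the example corpus
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)

# Each document of the new corpus is one segment of the original texts
ndoc(corpus)
```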

Next, we’ll compute a document-term matrix (dtm) and apply some preprocessing with quanteda functions.
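The text does not list the exact treatments, so the ones below (removing punctuation and English stopwords, lowercasing, dropping rare terms) and the threshold value are illustrative assumptions built from standard quanteda functions:

```r
library(quanteda)
library(rainette)

# Assumption: data_corpus_inaugural stands in for the example corpus
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)

# Tokenise and drop punctuation and English stopwords
tok <- tokens(corpus, remove_punct = TRUE)
tok <- tokens_remove(tok, stopwords("en"))

# Build the document-term matrix, lowercasing the terms
dtm <- dfm(tok, tolower = TRUE)

# Keep only terms that appear in at least 10 segments (arbitrary threshold)
dtm <- dfm_trim(dtm, min_docfreq = 10)
```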

We can then apply a simple clustering to this dtm with the rainette function. We specify the number of clusters (k), the minimum size a cluster must have to be split at the next step (min_split_members), and the minimum number of forms in each segment (min_uc_size).
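A sketch of the call, using the argument names given in the text. The values are arbitrary, the preparation steps are assumptions, and more recent releases of the package may have renamed min_uc_size, so check ?rainette for your installed version:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)

# Simple clustering: 6 clusters, segments of at least 15 forms,
# and only clusters with at least 20 members are split further
res <- rainette(dtm, k = 6, min_uc_size = 15, min_split_members = 20)
```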

We can use the rainette_explor shiny interface to visualise and explore the clusterings obtained for each value of k.
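Launching the interface might look like this sketch. The preparation steps and parameter values are assumptions, and the call is guarded so that it only runs in an interactive session:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm, cluster
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)
res <- rainette(dtm, k = 6, min_uc_size = 15, min_split_members = 20)

# Open the shiny gadget; it needs an interactive R session
if (interactive()) {
  rainette_explor(res, dtm)
}
```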

We can then use the R code generated by the interface to reproduce the displayed clustering plot.
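The generated code is typically a call to rainette_plot. The sketch below shows what such a call could look like. The option values are illustrative and not taken from the text, and the preparation steps are assumptions:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm, cluster
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)
res <- rainette(dtm, k = 6, min_uc_size = 15, min_split_members = 20)

# Plot the 5-cluster solution with bar charts of the 20 most
# characteristic terms per cluster, measured by chi-squared
rainette_plot(res, dtm, k = 5, type = "bar", n_terms = 20,
              free_scales = FALSE, measure = "chi2")
```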

Or cut the tree at a chosen k and add a group membership variable to our corpus metadata.
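A sketch of this step, assuming a cut at k = 5 and a metadata variable named group (both arbitrary choices), with the preparation steps assumed as well:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm, cluster
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)
res <- rainette(dtm, k = 6, min_uc_size = 15, min_split_members = 20)

# Cut the tree at 5 clusters and store the membership of each segment
# as a document variable (the name "group" is hypothetical)
docvars(corpus, "group") <- cutree_rainette(res, k = 5)

# Number of segments per cluster, including unassigned ones
table(docvars(corpus, "group"), useNA = "ifany")
```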

You can also perform a double clustering, i.e. two simple clusterings computed with different min_uc_size values, which are then “crossed” to generate more robust clusters. To do this, use rainette2 either on two rainette results or directly on a dtm with the uc_size1 and uc_size2 arguments.
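Both forms are sketched below, with the argument names given in the text. The values, max_k, and the preparation steps are assumptions, and more recent releases may have renamed min_uc_size, uc_size1 and uc_size2, so check ?rainette2:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)

# Form 1: cross two simple clusterings computed with different min_uc_size
res1 <- rainette(dtm, k = 5, min_uc_size = 10, min_split_members = 10)
res2 <- rainette(dtm, k = 5, min_uc_size = 15, min_split_members = 10)
res_double <- rainette2(res1, res2, max_k = 5)

# Form 2: let rainette2 compute both clusterings from the dtm
res_double <- rainette2(dtm, uc_size1 = 10, uc_size2 = 15, max_k = 5)
```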


You can then use rainette2_explor, rainette2_plot and cutree_rainette2 to explore and visualise the results.
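A sketch of how these three functions could be used on a double clustering. The preparation steps, the parameter values and the choice of k = 4 are assumptions, and a partition with that exact number of clusters is not guaranteed to exist for every corpus:

```r
library(quanteda)
library(rainette)

# Preparation (assumed): segment the corpus, tokenise, build the dtm
corpus <- split_segments(data_corpus_inaugural, segment_size = 40)
tok <- tokens_remove(tokens(corpus, remove_punct = TRUE), stopwords("en"))
dtm <- dfm_trim(dfm(tok, tolower = TRUE), min_docfreq = 10)

# Double clustering computed directly from the dtm
res <- rainette2(dtm, uc_size1 = 10, uc_size2 = 15, max_k = 5)

# Interactive exploration; needs an interactive R session
if (interactive()) {
  rainette2_explor(res, dtm)
}

# Static plot and cluster membership for a chosen number of clusters
rainette2_plot(res, dtm, k = 4)
groups <- cutree_rainette2(res, k = 4)
table(groups, useNA = "ifany")
```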

Tell me more


Three vignettes are available: an introduction in English, and an introduction and an algorithm description in French.

Credits

This classification method was created by Max Reinert and is described in several articles. Here are two references:

  • Reinert M., Une méthode de classification descendante hiérarchique : application à l’analyse lexicale par contexte, Cahiers de l’analyse des données, Volume 8, Numéro 2, 1983. http://www.numdam.org/item/?id=CAD_1983__8_2_187_0
  • Reinert M., Alceste une méthodologie d’analyse des données textuelles et une application: Aurelia De Gerard De Nerval, Bulletin de Méthodologie Sociologique, Volume 26, Numéro 1, 1990. https://doi.org/10.1177/075910639002600103

Thanks to Pierre Ratinaud, the author of Iramuteq, for releasing it as free and open source software. Although the R code has been almost entirely rewritten, Iramuteq was an invaluable resource for understanding the algorithms.

Many thanks to Sébastien Rochette for the creation of the hex logo.

Many thanks to Florian Privé for his work on rewriting and optimizing the Rcpp code.




