20  Our Packages

We have a suite of R packages that have been developed internally. Each serves a different purpose on a project, but together they aim to empower the SAMY Alliance. We don’t license the software to clients. What we sell is the knowledge that the packages help us produce.

20.1 ParseR

ParseR is the collective name for the techniques SAMY uses for text analysis. It’s primarily based on the tidytext philosophy and the analysis is normally carried out in R.
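To make the tidytext philosophy concrete, here is a minimal sketch of its core idea: a table with one token per row, which can then be counted, filtered, or joined like any other data frame. It uses base R only and does not show ParseR's or tidytext's actual functions; the `posts` data and all object names are hypothetical.

```r
# Hypothetical example posts; not real project data.
posts <- data.frame(
  id = 1:3,
  text = c("I love this phone", "This phone is great", "Great battery, love it"),
  stringsAsFactors = FALSE
)

# Lower-case each post, then split on any run of characters
# that is not a letter or an apostrophe.
token_list <- strsplit(tolower(posts$text), "[^a-z']+")

# Reshape to the tidy text format: one row per token, keeping the post id.
tokens <- data.frame(
  id = rep(posts$id, lengths(token_list)),
  word = unlist(token_list),
  stringsAsFactors = FALSE
)
tokens <- tokens[tokens$word != "", ]

# Count how often each word appears across all posts.
word_counts <- sort(table(tokens$word), decreasing = TRUE)
```

Once the text is in this shape, most analyses reduce to ordinary data-frame operations such as grouping by `id` or filtering out stop words.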

20.2 ConnectR

ConnectR is our package for network analysis. It helps the user find important individuals by graphing retweets and important communities by graphing mentions.
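As a rough illustration of the retweet idea, the sketch below treats each retweet as a directed edge from the retweeter to the original author and counts incoming edges per author. In-degree is only one simple measure of importance and is an assumption made here for illustration; it is not necessarily what ConnectR computes. The `retweets` data and column names are hypothetical, and only base R is used.

```r
# Hypothetical retweet edge list: each row is one retweet.
retweets <- data.frame(
  retweeter = c("ana", "ben", "cat", "dan", "ben"),
  original_author = c("eve", "eve", "eve", "ana", "ana"),
  stringsAsFactors = FALSE
)

# In-degree: how many times each author was retweeted.
in_degree <- sort(table(retweets$original_author), decreasing = TRUE)

# The account with the most incoming retweets in this toy network.
most_retweeted <- names(in_degree)[1]
```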

20.3 SegmentR

SegmentR is the collective name for the techniques SAMY uses to find latent groups in data.

20.4 BertopicR

BertopicR is our package which allows access to BERTopic’s modelling suite in R via reticulate.

20.5 LandscapeR

LandscapeR is our package for exploring text data which has been transformed into a navigable landscape. The package makes use of cutting-edge language models and their dense word embeddings, dimensionality reduction techniques, clustering and/or topic modelling as well as Shiny for an interactive data-exploration & cleaning UI.

If the conversation has been mapped appropriately, posts that sit close together in the Shiny application/UMAP plot will have similar meanings, while posts that sit far apart will have less similar meanings. This makes it possible to understand and explore thousands, hundreds of thousands, or even millions of posts at a level which was previously impossible.
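The general shape of that pipeline (embed, reduce to two dimensions, cluster) can be sketched in base R. This is a toy under stated assumptions, not LandscapeR's implementation: the "embeddings" are random numbers standing in for language-model output, PCA (`prcomp`) stands in for UMAP, and k-means stands in for whichever clustering or topic model a project uses.

```r
set.seed(42)

# Toy "embeddings": 20 posts in 8 dimensions, drawn from two
# well-separated groups to mimic two distinct topics.
group_a <- matrix(rnorm(10 * 8, mean = 0, sd = 0.5), nrow = 10)
group_b <- matrix(rnorm(10 * 8, mean = 5, sd = 0.5), nrow = 10)
embeddings <- rbind(group_a, group_b)

# Reduce to two dimensions so the posts can be plotted as a landscape.
# PCA is used here as a simple stand-in for UMAP.
pca <- prcomp(embeddings, center = TRUE, scale. = FALSE)
coords <- pca$x[, 1:2]

# Cluster the 2D coordinates; nstart = 10 tries several random starts.
clusters <- kmeans(coords, centers = 2, nstart = 10)$cluster

# Posts from the same group should land closer together than
# posts from different groups.
d <- as.matrix(dist(coords))
mean_within <- mean(d[1:10, 1:10])
mean_between <- mean(d[1:10, 11:20])
```

Calling `plot(coords, col = clusters)` draws the two groups as separate regions, which is the property the paragraph above describes.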

20.6 LimpiaR

LimpiaR is an R library of functions for cleaning & pre-processing text data. The name comes from ‘limpiar’, the Spanish verb ‘to clean’. Generally, when calling a LimpiaR function, you can think of it as ‘clean…’.

LimpiaR is primarily used for cleaning unstructured text data, such as that which comes from social media or reviews. Its initial release focuses on the Spanish language; however, some of its functions are language-agnostic.
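To give a flavour of what this kind of cleaning involves, here is a minimal base R sketch of a few common, language-agnostic steps. `clean_text_sketch()` is a hypothetical helper written for this example; it is not a LimpiaR function, and LimpiaR's own functions may behave differently.

```r
# Hypothetical helper, not part of LimpiaR.
clean_text_sketch <- function(x) {
  x <- tolower(x)                                  # normalise case
  x <- gsub("https?://\\S+", "", x, perl = TRUE)   # drop URLs
  x <- gsub("\\s+", " ", x, perl = TRUE)           # collapse repeated whitespace
  trimws(x)                                        # strip leading/trailing spaces
}

# Hypothetical raw post of the kind scraped from social media.
raw_post <- "  Me ENCANTA   este producto https://example.com  "
cleaned <- clean_text_sketch(raw_post)
```

Each step is a single substitution, so steps can be added, removed, or reordered depending on what the downstream analysis needs.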

20.7 DisplayR

DisplayR is our package for data visualization, offering a wide array of functions tailored to meet various data visualization needs. This versatile package aims to improve data presentation and communication by providing visually engaging and informative graphics.

20.8 HelpR

HelpR is SAMY’s R package for miscellaneous functions that can come in handy across a variety of workflows.

As you progress through your data science journey, you may take an interest in developing your own package. Depending on your previous experience developing software, this might be daunting, but don’t worry: none of us had much experience building packages when we joined. We all learned on the job, and so can you, if you want to.

See the Package Development document for more information.