Working Paper
Berlin Text System 3.1 User Manual : Editorial Software of the Thesaurus Linguae Aegyptiae Project
(2018)
The Berlin Text System (BTS) Version 3.1 manual introduces Java-based software for editing and annotating Ancient Egyptian texts. BTS integrates a CouchDB database and an Elasticsearch search engine to support its main components: Text Editor, Lemma List, Thesaurus, and Abstract Text.
The Text Editor facilitates transliteration, translation, lemmatization, and annotations, allowing for detailed lexical and grammatical analysis. Hieroglyphic transcriptions can be entered via a specialized Hieroglyph Type Writer based on JSesh.
The Lemma List is ready to hold pre-Coptic lemmata, divided into Hieroglyphic/Hieratic and Demotic scripts, and provides comprehensive entries with passport data, transliterations, and translations.
The Thesaurus supplies a controlled vocabulary for enriching texts with metadata, which keeps data management consistent and supports contextual analysis.
The manual covers the BTS user interface, including the menu bar, toolbar, status bar, and workspace, which is divided into views for each main component. Features such as indexing, search, and a Revision History for tracking and restoring earlier versions make day-to-day work more efficient. BTS is a powerful tool for the study and preservation of Ancient Egyptian texts, integrating advanced database and search technologies with specialized textual analysis tools.
Numerous high-quality primary text sources (in the context of the curation project described here, full-text transcriptions and corresponding image scans of German works from the 15th to the 19th centuries) are scattered across the web or stored remotely. For example, transcriptions of historical sources are stored locally on degrading recording media and cannot be found, let alone accessed, by third parties. In addition, idiosyncratic, project-specific markup conventions and uncommon, outdated, or inflexible storage formats often hinder further use and analysis of the data. Textual resources are also often accompanied by scarce, insufficient, or inaccurate bibliographic information, which is a further reason why valuable resources, even if available on the web, remain undiscovered by and of little use to the wider research community. Integrating these dispersed primary text sources into the sustainable, web- and centres-based research infrastructure of CLARIN-D will be an important step towards solving this problem. The full paper illustrates an exemplary approach taken by the »Deutsches Textarchiv« (DTA; www.deutschestextarchiv.de) at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW) to integrate dispersed textual resources and corresponding image scans from various sources into its own large historical text corpus and to bring them into the CLARIN-D infrastructure.