Bestroi150/README.md

πŸ›οΈ Hi, I'm Bestroi

Digital Humanities • Digital Classics • NLP • HTR • Network Analysis • ML

🌾 I explore ancient texts, manuscripts, and cultural heritage using modern computational tools.
🔍 My work combines HTR, NLP, philology, and network science to understand texts, people, and historical systems.

📚 About Me

💠 Researcher in Digital Humanities and Digital Classics
💠 Working on HTR for manuscripts, scribal studies, and paleography
💠 Applying NLP to historical languages, ancient corpora, and critical editions
💠 Interested in network analysis of texts, people, places, and cultural interactions
💠 Experienced with TEI XML, IIIF, semantic web, and cultural data modeling
💠 Passionate about AI for cultural heritage, manuscript digitization, and text encoding


⚡ Tech & Research Stack

Core Languages

Python JavaScript PHP HTML Markdown


📜 Digital Humanities / Digital Classics Tools

TEI XML IIIF Tesseract OCR Kraken Transkribus


🧠 NLP & Machine Learning

PyTorch TensorFlow Transformers spaCy NLTK fastText


🔗 Network Analysis & Graph Science

NetworkX Gephi Cytoscape Graphviz Pandas

📌 I build and analyze networks of:

  • ancient authors & texts
  • scribes and manuscript transmission
  • places, events, and cultural exchange
  • semantic, lexical, and concept networks
  • social networks in historical sources
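As a rough illustration of what building such a network can involve, the sketch below derives a weighted co-occurrence graph from records that list which names appear together. The manuscript and scribe identifiers are invented toy data, and the helper names `cooccurrence_edges` and `weighted_degree` are hypothetical; nothing here is taken from an actual project in this profile.

```python
from collections import defaultdict
from itertools import combinations

# Invented toy records (assumption): each manuscript lists the scribes attested in it.
manuscripts = {
    "MS_A": ["scribe_1", "scribe_2"],
    "MS_B": ["scribe_2", "scribe_3"],
    "MS_C": ["scribe_1", "scribe_2", "scribe_3"],
}

def cooccurrence_edges(records):
    """Count how often each unordered pair of names shares a record."""
    weights = defaultdict(int)
    for names in records.values():
        # Sort and deduplicate so every pair has one canonical (a, b) key.
        for a, b in combinations(sorted(set(names)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def weighted_degree(edges):
    """Sum the edge weights incident to each node."""
    degree = defaultdict(int)
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)

edges = cooccurrence_edges(manuscripts)
degree = weighted_degree(edges)

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: shared records = {weight}")
```

The same edge list could be passed to NetworkX as `(u, v, weight)` triples via `Graph.add_weighted_edges_from` for centrality measures or for export to Gephi.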

📊 Data Science

NumPy Pandas Matplotlib Plotly


🧪 Research Infrastructure

Docker GitHub Actions AWS Vercel


🧪 Research Interests

  • HTR for manuscripts, papyri, inscriptions, codices
  • Computational philology & digital editions
  • Network analysis of texts, authors, manuscripts
  • Cultural heritage informatics & semantic web
  • NLP for ancient / historical languages
  • TEI XML, IIIF, and digital curation workflows
  • Stylometry, authorship, semantic shift
  • Knowledge graphs for classical studies

📊 GitHub Stats

Streak
Languages


πŸ† GitHub Trophies

Trophies


✨ Quote

“The automation of the linguistic analysis of texts is not intended to replace the humanist, but to free him from the mechanical part of his work.”
– Roberto Busa, S.J.


πŸ›οΈ Mapping the past, modeling the present, and decoding history with code.

Popular repositories

  1. DigitalSEE Public

    A digital repository of 18th-19th century historical data (engravings, maps, travelogues, diplomatic reports, newspapers, journals, archival materials) structured in bilingual XML files (English an…

    1

  2. DigitalSEE-DataEntrySystem Public

    DigitalSEE (Digital South-Eastern Europe) Data Entry System is a Flask-based web tool that enables users to create, upload, and search XML metadata files for historical and archaeological sites. De…

    HTML 1

  3. DigitalSEE-HistoricalDataViewer Public

    DigitalSEE-HistoricalDataViewer is a Streamlit-based application that visualizes historical geographic data from the DigitalSEE project. Connected to a Hugging Face Space and using Folium for inter…

    Python 1

  4. digitalsee-tei-collection Public

    Collection of TEI-encoded XMLs from the DigitalSEE project, documenting historical monuments and artifacts. The dataset is freely available under the CC BY 4.0 license.

    1

  5. Georgievi-schema Public

    This XSD schema defines the structure of XML documents describing correspondence and archival materials related to the Georgievi brothers. It formalizes elements such as document metadata, senders …

  6. NLP_LAT_COLL Public

    This dataset contains the complete Satyricon text alongside a curated subset, with full linguistic tokenization and XML markup. The corpus (*sermo vulgaris*) represents Petronius's vernacular Latin…
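To give a sense of what tokenization with XML markup can look like, here is a minimal sketch using only the Python standard library. It is not the scheme used in NLP_LAT_COLL, whose markup is not shown here: the element names `s`, `w`, and `pc` are borrowed from TEI conventions as an assumption, the helpers `tokenize` and `sentence_to_xml` are hypothetical, and the sample sentence is a placeholder rather than text from the corpus.

```python
import re
import xml.etree.ElementTree as ET

def tokenize(text):
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

def sentence_to_xml(sentence, n):
    """Wrap each token of a sentence in <w> (word) or <pc> (punctuation)."""
    s = ET.Element("s", {"n": str(n)})
    for token in tokenize(sentence):
        # Tokens starting with a letter or digit are treated as words.
        tag = "w" if token[0].isalnum() else "pc"
        ET.SubElement(s, tag).text = token
    return s

# Placeholder sentence (assumption), not taken from the Satyricon corpus.
sample = "Salve, amice."
xml_string = ET.tostring(sentence_to_xml(sample, 1), encoding="unicode")
print(xml_string)
```

A real pipeline for a Latin corpus would usually also attach lemma and part-of-speech attributes to each `w` element, which this sketch leaves out.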