Create results

Step 1: Download and install this repository

Create and activate a virtual environment and clone this repository (11.38 MB) by running:

python3 -m venv venv && source venv/bin/activate
git clone https://github.com/LoanpyDataHub/GothicHungarian.git

Next, clone the two repositories containing Hungarian and Gothic language data (10.65 MB + 8.75 MB):

git clone https://github.com/LoanpyDataHub/gerstnerhungarian
git clone https://github.com/LoanpyDataHub/koeblergothic

Next, from the same directory, run:

pip install -e GothicHungarian

This installs a command-line interface for running the analysis, along with two dependencies: loanpy and spaCy. spaCy additionally needs a pretrained German word-vector model. Different models are listed on the spaCy website. Whichever model you choose, make sure it is the same one used in gerstnerhungarian and koeblergothic, because entries in those repositories were filtered out if they were missing from that particular word-vector model. Currently, the following 500 MB model seems to be the most suitable:

python3 -m spacy download de_core_news_lg

To deactivate the virtual environment run:

deactivate

and to remove it run:

rm -r venv

Step 2: Load the relevant data in the right format

From your command line, run:

loadinput

Load and transform input data for loanfinder, save to raw folder.

gothuncommands.loadinput.main()
  1. Read the filenames with the argparse library

  2. Assign the file contents to variables

  3. Create four dictionaries from them for later use

  4. Create input for loanpy.loanfinder.phonetic_matches.

  5. Write files to raw folder.
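
As a rough illustration of step 1, filenames can be read with argparse along the following lines. This is only a sketch: the option names `--donor` and `--recipient` and their default values are hypothetical and are not taken from the actual `loadinput` command.

```python
import argparse


def parse_args(argv=None):
    """Read input filenames from the command line.

    The option names and defaults are assumptions made for this sketch.
    """
    parser = argparse.ArgumentParser(
        description="Load and transform input data for loanfinder."
    )
    parser.add_argument("--donor", default="donor.tsv",
                        help="file with donor-language forms (hypothetical)")
    parser.add_argument("--recipient", default="recipient.tsv",
                        help="file with recipient-language forms (hypothetical)")
    return parser.parse_args(argv)


# Passing a list instead of reading sys.argv keeps the example reproducible.
args = parse_args(["--donor", "got.tsv"])
print(args.donor, args.recipient)
```

Because every option has a default, a command built this way can also be run without any arguments.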

Step 3: Search for phonetic matches

From your command line, run:

phonmatch

Read the prepared input files in folder raw and search for phonetic matches between Gothic and Hungarian. Write result as phonetic_matches.tsv to folder out.

gothuncommands.phonmatch.main()
  1. Import loanpy.loanfinder.phonetic_matches

  2. Read the input data

  3. Pass it on to loanpy

  4. End the function since loanpy writes the file

Step 4: Search for semantic matches

From your command line, run:

semmatch

Read the phonetic matches file in folder out and search for semantic matches among them. Write results as semantic_matches.tsv to folder out.

gothuncommands.semmatch.main()
  1. Import loanpy.loanfinder.semantic_matches

  2. Read phonetic matches file with csv library

  3. Read related tables that contain the meanings

  4. Grab meanings from related tables and create new input table

  5. Pass the table to loanpy.loanfinder.semantic_matches (https://loanpy.readthedocs.io/en/latest/documentation.html#loanpy.loanfinder.semantic_matches)

  6. End the function since loanpy writes the file
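
Steps 2 to 4 amount to a lookup join between the phonetic matches and the tables holding the meanings. The sketch below shows the idea on invented toy data. The column names `ID_rc`, `ID_ad`, `meaning_rc` and `meaning_ad`, the helper `attach_meanings`, and all rows are assumptions for illustration, not the actual file layout.

```python
import csv
import io

# Toy stand-ins for the phonetic matches file and the two meaning tables.
# Column names and contents are invented for this sketch.
phonetic_matches = "ID\tID_rc\tID_ad\n0\th1\tg1\n1\th2\tg1\n"
recipient_senses = {"h1": "Haus", "h2": "Wasser"}
donor_senses = {"g1": "Heim"}


def attach_meanings(tsv_text, recipient_lookup, donor_lookup):
    """Return the match rows with one meaning column added per language."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            **row,
            "meaning_rc": recipient_lookup[row["ID_rc"]],
            "meaning_ad": donor_lookup[row["ID_ad"]],
        })
    return rows


table = attach_meanings(phonetic_matches, recipient_senses, donor_senses)
for row in table:
    print(row["ID"], row["meaning_rc"], row["meaning_ad"])
```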

gothuncommands.semmatch.semsim(meaning1, meaning2)
  1. Convert each meaning to a spaCy object

  2. Create the Cartesian product of both meaning lists with a nested for-loop

  3. Return the similarity of the most similar pair
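
The "most similar pair" logic can be sketched independently of spaCy. In the example below, the function name `best_pair_similarity` and its third parameter are assumptions, and `toy_similarity` is a stand-in that compares character sets. In the real setting one would presumably compare word vectors instead, for example with spaCy's `Doc.similarity`.

```python
from itertools import product


def best_pair_similarity(meanings1, meanings2, similarity):
    """Return the highest similarity over all pairs of meanings.

    itertools.product yields the same pairs as a nested for-loop.
    An empty input list gives 0.0 (a choice made for this sketch).
    """
    scores = (similarity(a, b) for a, b in product(meanings1, meanings2))
    return max(scores, default=0.0)


def toy_similarity(a, b):
    """Stand-in for a vector similarity: Jaccard overlap of character sets."""
    set_a, set_b = set(a), set(b)
    return len(set_a & set_b) / len(set_a | set_b)


best = best_pair_similarity(["Haus", "Heim"], ["Haus", "Wasser"], toy_similarity)
print(best)
```

Taking the maximum rather than the mean means that a single closely related sense is enough for two polysemous entries to count as a semantic match.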

Step 5: Load columns for manual inspection

From your command line, run:

loadcols

Merge IDs in out/semantic_matches.tsv with relevant columns for manual inspection.

gothuncommands.loadcols.main()
  1. Read the semantic matches file

  2. Read the related tables

  3. Stitch the desired columns together

  4. Overwrite the input file
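
The read, stitch, and overwrite pattern described above might look roughly like this. The helper `add_column`, the column names `ID_rc` and `form_rc`, and the toy row are hypothetical. The sketch works on a temporary file rather than on out/semantic_matches.tsv.

```python
import csv
import tempfile
from pathlib import Path


def add_column(path, lookup, key, new_column):
    """Add new_column to the TSV at path, looked up via row[key], in place."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    if not rows:
        return  # nothing to stitch together
    for row in rows:
        row[new_column] = lookup.get(row[key], "")
    # Overwrite the input file with the widened table.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()), delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)


# Toy file with invented content.
tmp = Path(tempfile.mkdtemp()) / "semantic_matches.tsv"
tmp.write_text("ID_rc\tsemsim\nh1\t0.8\n", encoding="utf-8")
add_column(tmp, {"h1": "ház"}, key="ID_rc", new_column="form_rc")
result = tmp.read_text(encoding="utf-8")
print(result)
```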

Step 6: Manually inspect the results

Open the file in a spreadsheet program and sort the rows by semantic similarity (column semsim) and, within that, by cognate ID (column ID_s). Look carefully at the matches and pick candidate loanwords where both the phonetic match and the semantic shift look plausible.
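
If a spreadsheet is not at hand, the same two-level sort can be done in Python. The sketch below assumes that higher semsim values should come first and that ID_s holds integers. Both are assumptions, and the rows are invented.

```python
# Toy rows; only the column names semsim and ID_s come from the text above.
rows = [
    {"ID_s": "7", "semsim": "0.55"},
    {"ID_s": "3", "semsim": "0.91"},
    {"ID_s": "1", "semsim": "0.91"},
]

# Sort by descending similarity, then by ascending cognate ID within ties.
rows_sorted = sorted(rows, key=lambda r: (-float(r["semsim"]), int(r["ID_s"])))

for row in rows_sorted:
    print(row["semsim"], row["ID_s"])
```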