tutorial.walkthrough

An end-to-end walkthrough on setting up a trouver workflow

In this tutorial, we describe a recommended setup for using trouver. Note that the exact details and consequences of this walkthrough may change as trouver or any related software changes; the current version of this walkthrough was written on 4/9/2023 and was written based on trouver version 0.0.4.

See also the Google Colab notebook of this walkthrough (the Google Colab notebook is currently imageless).

Installations

See also how_to.install_trouver.

Open a command line (e.g. cmd on Windows, Terminal on Linux) and install trouver with pip:

pip install trouver
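Since this walkthrough was written against trouver version 0.0.4, it can be useful to check which version pip actually installed. The following is a minimal sketch using only the Python standard library; `installed_version` is a helper name introduced here for illustration and is not part of trouver.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what is installed in the current Python environment.
print("trouver:", installed_version("trouver") or "not installed")
```

If the reported version differs from 0.0.4, some details of this walkthrough may no longer match.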

Install JupyterLab.

pip install jupyterlab

Once installed, launch JupyterLab with

jupyter-lab

This is supposed to be an image of JupyterLab being launched

Alternatively, install the classic Jupyter Notebook via

pip install notebook

and launch with

jupyter notebook

This is supposed to be an image of classic Jupyter notebook being launched

or use Visual Studio Code to view and edit notebooks.

This is an image of Visual Studio Code

Install Obsidian.md. Obsidian.md is a note taking software that, among other functionalities, makes it easy to link between notes.

Note that the most important basic functionalities in Obsidian.md are free.

This is an image of the Obsidian.md website

The following window pops up upon launching Obsidian.md

This is an image of Obsidian.md upon launch

Warning At the time of this writing, trouver has not been tested extensively on MacOS. We have also found that running the latest Python versions in Jupyter notebooks on modern MacOS machines (e.g. those using the M1 processor, and in particular the arm64 architecture) leads to some issues; cf. Stack Exchange discussions such as

- https://apple.stackexchange.com/questions/436801/m1-mac-mach-o-file-but-is-an-incompatible-architecture-have-x86-64-need-a
- https://stackoverflow.com/questions/71502583/jupiter-wont-launch-from-terminal-on-m1-macbook

For MacOS users, it may be useful to go through this tutorial/walkthrough using a Python console or using the Google Colab Notebook for this tutorial/walkthrough.

Creating an Obsidian.md vault

An Obsidian vault is a folder where Obsidian stores your notes as well as setting files, CSS, trash folder, and any sub-folders, notes, and attachments you add yourself, cf. “What exactly is a vault?”.

Let us make a new Obsidian.md vault. For this example, we call the vault example_math_vault and create it in a local folder called trouver_walkthrough_folder. Doing so creates a folder example_math_vault in trouver_walkthrough_folder, i.e. creates the folder trouver_walkthrough_folder/example_math_vault:

This is an image of creating a new `Obsidian.md` vault

Upon creating the vault, Obsidian.md will open the vault:

This is an image of the newly created `Obsidian.md` vault
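To confirm from Python that the folder is a vault, one can check for the `.obsidian` settings folder that Obsidian creates inside every vault. This is only a sketch: `looks_like_obsidian_vault` is a hypothetical helper, and the relative path below assumes the code is run from the directory containing trouver_walkthrough_folder.

```python
from pathlib import Path


def looks_like_obsidian_vault(folder: Path) -> bool:
    """Return True if `folder` exists and contains an `.obsidian` settings folder."""
    return folder.is_dir() and (folder / ".obsidian").is_dir()


# Assumed location; adjust to wherever trouver_walkthrough_folder lives on your machine.
vault = Path("trouver_walkthrough_folder") / "example_math_vault"
print(looks_like_obsidian_vault(vault))
```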

Obsidian.md functionalities

Creating files and folders in an Obsidian vault

By and large, one can make files and folders fairly liberally in one’s own Obsidian vault (note, however, that Obsidian can experience significant lags on vaults with many notes, say on the order of 10,000, depending on the machine running it).

For example, let us create a new note by clicking on the “New note” icon in the left pane:

Click on the new note button

Let us name the new note _index:

Rename the new note from `Untitled` to `_index`

Let us check that a new file named _index.md has been created on the computer. To open the directory that the file is in, it is convenient to use the Show in system explorer command. To use this command, open the Command Palette by either clicking on the Open command palette icon in the left pane

The Open command palette icon in the left pane

or by using the Hotkey/keyboard shortcut for the Open command palette command. By default, this shortcut is either Ctrl + p or Cmd + p (or the appropriate variant for your computer).

Either way, the command palette should open up.

The command palette

Type the word system so that the command palette shows the Show in system explorer command.

The show in system explorer command

Select this command to open the directory containing the _index.md file, which should be the root directory of the vault:

See the `_index.md` file created in the root directory of the vault

Quick Switcher

In Obsidian, one can search and open notes by name. Open the Quick switcher either with the Open quick switcher icon in the left pane or with the Hotkey Ctrl + o or Cmd + o:

Open quick switcher icon

The quick switcher

Start typing _index_number_theory. Note that the quick switcher autocompletes note names. Open the _index_number_theory note by selecting it in the quick switcher:

Type in the name of a note to get the quick switcher to open it


One can add “aliases” to Obsidian notes to make them better searchable.

More Obsidian features

Obsidian has quite a lot of features that make Obsidian vaults highly useful and customizable. In fact, there are quite a lot of features of Obsidian that the author of trouver regularly uses:

  • Vim keybindings
  • CSS Styles
  • Plugins
  • etc.

See the Obsidian Help vault online for more information on Obsidian’s features.

Adding some files and folders for formatting for trouver

For the purposes of using trouver, we will need a few more folders. We will need a _references folder and a _templates folder and multiple subfolders. For convenience, copy the _references and _templates folders from the nbs/_tests/empty_model_vault directory of the trouver GitHub repository:

The `_references` and `_templates` folders copied into the vault

Each of these folders has subfolders A-E, F-J, K-O, P-T, and U-Z and subsubfolders corresponding to each letter in the English alphabet. These are for organization. There are also multiple files named README.md in these subsubfolders. Feel free to delete them.
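To make the layout concrete, the following sketch computes which subfolder and subsubfolder a given name falls under, assuming that files are sorted by the first letter of the reference name (this is inferred from the folder names, not taken from trouver's code). `letter_subfolder` is a hypothetical helper.

```python
from pathlib import Path

# The five letter ranges used for the subfolders, as described above.
LETTER_GROUPS = ("A-E", "F-J", "K-O", "P-T", "U-Z")


def letter_subfolder(reference: str) -> Path:
    """Return the relative path <letter range>/<letter> for a reference name."""
    first = reference[:1]
    if not (first.isascii() and first.isalpha()):
        raise ValueError(f"{reference!r} does not start with an English letter")
    first = first.upper()
    for group in LETTER_GROUPS:
        low, high = group.split("-")
        if low <= first <= high:
            return Path(group) / first
    raise AssertionError("unreachable: the groups cover A through Z")


print(letter_subfolder("kim_park_ga1dcmmc").as_posix())  # K-O/K
```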

Rationale for the _references and _templates folders

The basic file organization philosophy for trouver is that each mathematical reference should have a dedicated folder in a vault and mathematical (standard information) notes derived from the mathematical reference should belong to this folder (or one of its subfolders/subsubfolders/etc).

The _references folder contains notes that hold information about each mathematical reference, e.g. bibliographical information, where to find the reference (say on the arxiv), and any other personal notes about the reference. The reference notes will be embedded in the standard information notes so that this information does not have to be manually replicated from standard information note to standard information note.

The _templates folder contains notes which serve as templates for the standard information notes for each reference.

Dividing a LaTeX file

Obtaining an arxiv file

One fortunate state of mathematics is that many manuscripts are made publicly available on arxiv.org before they are formally published. In fact, the source code of many of these manuscripts is also generally available on the arxiv.

trouver can take the source code of a LaTeX file and divide it into not-too-long parts in an Obsidian vault.

As an example, let us go to the arxiv page for a paper written by the creator of trouver with a collaborator (See the Special Thanks section of index for an acknowledgement to Sun Woo Park, the collaborator in question, for permitting this paper to be presented for examples in trouver’s documentation).

The webpage for an arxiv article

On the right, beneath Download:, note that there is a link titled Other formats. Follow the link to go to the Format selector page:

The format selector page for the arxiv article

Around the bottom of this page is the option to download the source file(s) for the article. Click the Download source link to download the source file(s). Move the files to a different location if desired. For this example, we made a folder named latex_files in the trouver_walkthrough_folder, a subfolder named kim_park_ga1dcmmc in latex_files, and moved the source file into kim_park_ga1dcmmc.

New folder with arxiv source file

Either the file is compressed and contains multiple files, or the file is not compressed and is actually a .tex file. Attempt to decompress the file. If this fails, rename the file to have the .tex extension. Either way, you should have a .tex file at this stage. For this example, it turns out that the file is compressed and multiple files appear upon decompression.

Attempt to decompress the file

Files extracted
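The decompress-or-rename step can also be scripted. The sketch below uses only the standard library and assumes that the download is a (possibly compressed) tar archive, a single gzipped file, or a plain .tex file; `unpack_arxiv_source` is a hypothetical helper, and writing a single file out as main.tex is a choice made here for illustration.

```python
import gzip
import tarfile
from pathlib import Path


def unpack_arxiv_source(source: Path, dest: Path) -> None:
    """Unpack `source` into the folder `dest`.

    Tries a tar archive first, then a single gzipped file, and finally
    treats the download as a plain .tex file.
    """
    dest.mkdir(parents=True, exist_ok=True)
    if tarfile.is_tarfile(source):
        with tarfile.open(source) as archive:
            try:
                archive.extractall(dest, filter="data")  # safer extraction where supported
            except TypeError:  # older Python versions without the `filter` argument
                archive.extractall(dest)
        return
    try:
        with gzip.open(source, "rb") as compressed:
            data = compressed.read()
    except OSError:
        data = source.read_bytes()  # not gzipped: assume it is already LaTeX
    (dest / "main.tex").write_bytes(data)
```

For this example, one would call `unpack_arxiv_source` with the downloaded file and the kim_park_ga1dcmmc folder as arguments.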

The author usually finds it convenient to name the (main) .tex file main.tex for the purposes of using trouver, but the name of the .tex file ultimately does not matter.
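When several .tex files appear after decompression, the main file is usually the one containing a `\documentclass` declaration. The following sketch lists such candidates; `candidate_main_tex_files` is a hypothetical helper, and the heuristic is a general LaTeX convention rather than something trouver itself checks.

```python
from pathlib import Path
from typing import List


def candidate_main_tex_files(folder: Path) -> List[Path]:
    """Return the .tex files under `folder` that contain a \\documentclass declaration."""
    candidates = []
    for tex_file in sorted(folder.rglob("*.tex")):
        text = tex_file.read_text(encoding="utf-8", errors="ignore")
        if "\\documentclass" in text:
            candidates.append(tex_file)
    return candidates
```

If exactly one candidate is returned, renaming it to main.tex matches the convention used in the rest of this walkthrough.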

Using trouver to divide the file and make notes in the Obsidian vault

Now that we have the .tex file set up, we are ready to use the code from trouver to divide it. While the code can be run from any Python interpreter, we highly recommend using a notebook (say via Jupyter or VSCode) to run trouver code.

To open JupyterLab, open a command-line interface and run

jupyter-lab

JupyterLab opened once more.

Create a new notebook and run the following Python import statements:

import os
from pathlib import Path
import shutil
import tempfile
from trouver.helper import _test_directory, text_from_file
from trouver.latex.convert import (
    divide_preamble, divide_latex_text, custom_commands,
    setup_reference_from_latex_parts
)

Further run (a variant of) the following code. Replace latex_file and vault with Python Path objects appropriate for your example. For more general instances, you may want to change the reference, location, and author_names variables as well.

# One way to get folders accessible in Google Colab is to upload
# zip files of them and to unzip them.
# !unzip /content/example_math_vault.zip
# !unzip /content/latex_files.zip
# If using Google colab, make sure to upload the latex file and the vault folder so that Google Colab can
# access them.
reference = 'kim_park_ga1dcmmc'  # The name of the reference as well as the name of the folder that contains the latex file
latex_file = Path(r'C:\Users\hyunj\Documents\Development\Python\trouver_walkthrough_folder\latex_files') / reference / 'main.tex'
latex_text = text_from_file(latex_file)
preamble, _ = divide_preamble(latex_text)
parts = divide_latex_text(latex_text)
cust_comms = custom_commands(preamble)
vault = Path(r'C:\Users\hyunj\Documents\Development\Python\trouver_walkthrough_folder\example_math_vault')  # The path to the Obsidian vault
location = Path('number_theory')  # The path relative to the vault of the directory in which to make the new folder containing the new notes.
author_names = ['Kim', 'Park']  # The names of the authors; this is for the template file for the reference folder.

setup_reference_from_latex_parts(
    parts, cust_comms, vault, location,
    reference,
    author_names,
    overwrite='w',
    confirm_overwrite=True)

The code has been executed in JupyterLab and the code has set up a folder in the Obsidian vault

Safely ignore the warning messages about configuration files for plugins.

Now return to the Obsidian vault. Note that a new folder named kim_park_ga1dcmmc has been created in the number_theory folder of the vault. In Obsidian, open the _index_kim_park_ga1dcmmc note:

The index file for the reference folder is opened in Obsidian.

Click on any of the links in the _index_kim_park_ga1dcmmc note to open another “index note”:

The index file a section of the reference is opened in Obsidian.

Click on any of the links in the index note to open a “standard information note”. This note contains some text from the paper on arxiv.

A standard information note for the reference

You may need to reload the Obsidian vault before you can view some of these notes; Obsidian may not have registered/“indexed” some of these notes upon their creation. Reload Obsidian using the Reload app without saving command in the command palette:

Reload Obsidian vault

At this point, we recommend that you explore the newly created notes and folders. As an overview, the above code does the following:

  1. Creates a folder for the reference in the specified folder (In this case, a folder named kim_park_ga1dcmmc was created in the number_theory folder in the vault).
  2. Creates folders corresponding to sections of the LaTeX file.
  3. Creates various notes/files in the reference folder, including, but not limited to
    • standard information notes
    • an index note (in this case named _index_kim_park_ga1dcmmc)
  4. A template note for the standard information notes for the reference folder in the _templates folder of the vault. (In this case, the template note is created in _templates/K-O/K folder as the file _template_kim_park_ga1dcmmc.md)
  5. A copy of the above template note located in the root of the reference folder (In this case, the template note is created as the file _template_kim_park_ga1dcmmc_2.md).
    • This copy of the template note is created for use in case the user wishes to edit the reference folder by opening it as an Obsidian vault; in general, subdirectories of an Obsidian vault can be opened as Obsidian vaults in themselves. This can be advantageous when the “main” vault has many notes and thus Obsidian runs slowly in the “main” vault.
  6. A reference note for the reference in the _references folder of the vault (In this case, the reference note is located in the _references/K-O/K folder as the file _reference_kim_park_ga1dcmmc.md)
    • The user is encouraged to fill in this note with bibliographic information of the reference. Note that this reference note is embedded in the standard information notes for the reference, so the information provided in the reference note is easily readable from the standard information notes (in viewing mode).
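The locations listed above can be checked from Python. The sketch below builds the expected paths from the file names quoted in the list; `expected_reference_paths` is a hypothetical helper, it assumes that the reference name starts with an English letter, and it assumes that the index note sits at the root of the reference folder.

```python
from pathlib import Path
from typing import Dict


def expected_reference_paths(vault: Path, location: Path, reference: str) -> Dict[str, Path]:
    """Return the paths that the setup is described as creating for `reference`."""
    letter = reference[0].upper()
    groups = ("A-E", "F-J", "K-O", "P-T", "U-Z")
    group = next(g for g in groups if g[0] <= letter <= g[-1])
    folder = vault / location / reference
    return {
        "reference folder": folder,
        "index note": folder / f"_index_{reference}.md",  # assumed to be at the folder root
        "template copy": folder / f"_template_{reference}_2.md",
        "template note": vault / "_templates" / group / letter / f"_template_{reference}.md",
        "reference note": vault / "_references" / group / letter / f"_reference_{reference}.md",
    }


# Example with a relative vault path; replace with the path to your own vault.
paths = expected_reference_paths(
    Path("example_math_vault"), Path("number_theory"), "kim_park_ga1dcmmc")
for label, path in paths.items():
    print(f"{label}: {path} (exists: {path.exists()})")
```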

If you made a mistake in running the above code and would like to re-setup your reference folder, you can use the following code to delete the reference folder, the template note, and the reference note. Enter the letter Y (case sensitive) to confirm that you wish to delete the folder and these notes. Note that this operation cannot be reversed.

from trouver.markdown.obsidian.personal.reference import (
    delete_reference_folder
)
# TODO: delete template and reference files even without reference folder.
delete_reference_folder(vault, reference)

Delete reference folder in Jupyter

Using these notes

One can fairly liberally modify the contents in a standard information note for the purposes of trouver. Some things that we recommend include, but are not limited to:

  • For the user to read more about Obsidian’s features in Obsidian’s official help vault, including Obsidian’s flavor of Markdown syntax
    • However, there are many forms of Obsidian’s Markdown syntax that trouver does not yet parse. One example is that trouver does not currently parse inline footnotes.
  • For the user to see what (community) plugins are available for Obsidian.
    • Use these plugins at your own risk, as there is no way to ultimately ensure that malicious or insecure code is not included in such plugins. However, most, if not all, of these plugins are open source, so users can inspect for themselves how these plugins are implemented.
  • For the user to read trouver’s documentation and to explore the example vaults in the nbs/_tests directory of trouver(’s GitHub repository)
  • For the user to liberally mark their standard information notes with footnotes with personal memos about their experience reading the specific excerpt(s) from the specific mathematical references.

An example of a note with footnotes memoing the user's thought process in reading a paper.

  • For the user to make modifications in the standard information notes to clean up the LaTeX syntax for the purposes of reading in Obsidian.md
    • For example, $ \operatorname{Gal}(L/K)$ (notice the space between the $ and the \ characters) will not render properly due to the space. One can correct this by typing $\operatorname{Gal}(L/K)$ instead.
  • For the user to make links between notes to develop a stronger understanding of the relationship among notes and to build reliable methods for quickly recalling mathematical facts, definitions, or notations.
  • For the user to NOT create and maintain notes of the same file name in a single vault. trouver generally operates under the assumption that an Obsidian vault will not have two or more files of the same name. In particular, errors or unexpected consequences may arise if the code of trouver is run on an Obsidian vault with two files of the same name.
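The LaTeX cleanup mentioned above (removing the space after an opening `$`) can be partly automated. The following is a rough sketch with a regular expression, not a trouver function; `strip_inline_math_padding` is a hypothetical name. It leaves display math (`$$ ... $$`) alone and does not handle escaped dollar signs or dollar signs used as currency, so review its output before saving a note.

```python
import re

# Inline math: a single `$`, some text without `$` or newlines, then a single `$`.
_INLINE_MATH = re.compile(r"(?<!\$)\$(?!\$)([^$\n]+?)\$(?!\$)")


def strip_inline_math_padding(text: str) -> str:
    """Remove whitespace just inside the `$` delimiters of inline math."""
    def fix(match):
        inner = match.group(1).strip()
        # Leave whitespace-only spans untouched rather than producing `$$`.
        return f"${inner}$" if inner else match.group(0)
    return _INLINE_MATH.sub(fix, text)


print(strip_inline_math_padding(r"The group $ \operatorname{Gal}(L/K)$ acts."))
```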

However, here are some modifications that one should not make to a standard information note for the purposes of trouver:

  • Changing the title of/deleting the See Also, Meta, References, Citations and Footnotes headers/sections; doing so will result in errors in some functionalities of trouver, as trouver’s criterion for recognizing a standard information note is that the note has such headers.
    • One can nevertheless modify the contents within these sections. Note that the title of the section titled Topic may be modified liberally and that sections can be added liberally.
  • Changing the formatting of the frontmatter YAML metadata (The text in the beginning of a note starting with ---).
    • One can and should nevertheless modify the contents of the frontmatter YAML metadata as appropriate.
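Before running trouver on a note that has been edited by hand, one can check that these sections and the frontmatter are still in place. This is only a sketch and does not reproduce trouver's own recognition logic: the list of titles is read off the sentence above (treating References and Citations and Footnotes as two separate headers is an assumption), heading levels are not checked, and both function names are hypothetical.

```python
import re
from typing import List

# Assumption: these are the section titles that must be kept.
REQUIRED_TITLES = ("See Also", "Meta", "References", "Citations and Footnotes")


def missing_required_sections(note_text: str) -> List[str]:
    """Return the required titles that do not appear as a Markdown heading."""
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^#+[ \t]+(.+?)[ \t]*$", note_text, flags=re.MULTILINE)
    }
    return [title for title in REQUIRED_TITLES if title not in headings]


def starts_with_frontmatter(note_text: str) -> bool:
    """Return True if the note begins with a `---` frontmatter delimiter line."""
    return note_text.startswith("---\n") or note_text.startswith("---\r\n")
```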

Using Machine learning models

Now that we have set up a reference folder from a LaTeX file, we should make great use of the notes/files in the folder to learn concepts presented in the LaTeX file.

One of the basic challenges in deciphering a mathematical paper is grasping the definitions and notations presented in the paper. The machine learning models can help make this process more reliable and faster.

Currently, the functionalities of trouver are more focused towards finding notations as opposed to definitions. In time, we hope to improve these functionalities to better encompass finding definitions as well. Nevertheless, the functionalities of trouver effectively allow the user to identify where notations are introduced in the various notes in the reference folder and to create, for each notation, a note dedicated to the notation.

Currently, trouver uses three ML models:

  1. hyunjongkimmath/information_note_type
  2. hyunjongkimmath/notation_identification
  3. hyunjongkimmath/notation_summarizations_model

The first two models are trained using fast.ai and the third model is trained using the Hugging Face Transformers library.

ML models can be computationally intensive to run. As such, trouver roughly operates on a “run-once, record results for later use” principle when it comes to its ML models. Moreover, since ML models inherently cannot be perfect, trouver also operates on the general principle of allowing users to manually correct these recorded results without raising errors.

Remark Graphics processing units (GPUs) are not necessary to use these models. In particular, these models can be loaded onto a computer’s CPU.

Downloading and loading the ML models

Since ML models typically are large in file size, the models are not part of the trouver library itself. Instead, the models are made publicly available on Hugging Face, where they can be downloaded.

Run the following code to download the models from the Hugging Face Hub and then to load the models. Depending on your internet connection, it may take a few minutes to download these models the first time because each model is at least several hundred megabytes in size. Moreover, these models are saved to a local cache folder. See Hugging Face cache management for more details.

import pathlib
from pathlib import Path, WindowsPath
import platform

from huggingface_hub import from_pretrained_fastai

from trouver.markdown.obsidian.personal.machine_learning.information_note_types import automatically_add_note_type_tags

# Load the model that categorizes the type(s) of standard information notes
repo_id = 'hyunjongkimmath/information_note_type'
if platform.system() == 'Windows':
    temp = pathlib.PosixPath # See https://stackoverflow.com/questions/57286486/i-cant-load-my-model-because-i-cant-put-a-posixpath
    pathlib.PosixPath = pathlib.WindowsPath
    information_note_type_model = from_pretrained_fastai(repo_id)
    pathlib.PosixPath = temp
else:
    information_note_type_model = from_pretrained_fastai(repo_id)

# Load the model that finds notations introduced in standard information notes
from trouver.markdown.obsidian.personal.machine_learning.notation_identification import automatically_mark_notations
repo_id = 'hyunjongkimmath/notation_identification'
if platform.system() == 'Windows':
    temp = pathlib.PosixPath # See https://stackoverflow.com/questions/57286486/i-cant-load-my-model-because-i-cant-put-a-posixpath
    pathlib.PosixPath = pathlib.WindowsPath
    notation_identification_model = from_pretrained_fastai(repo_id)
    pathlib.PosixPath = temp
else:
    notation_identification_model = from_pretrained_fastai(repo_id)

# Load the model that summarizes what notations denote
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline
from trouver.markdown.obsidian.personal.machine_learning.notation_summarization import append_summary_to_notation_note
model = AutoModelForSeq2SeqLM.from_pretrained('hyunjongkimmath/notation_summarizations_model')
tokenizer = AutoTokenizer.from_pretrained('hyunjongkimmath/notation_summarizations_model')
summarizer = pipeline('summarization', model=model, tokenizer=tokenizer)

The models are now loaded in a Jupyter notebook.

Note that there are some if-else statements used to load the fast.ai models; this is a workaround implemented for Windows users as loading fast.ai models seems to have some issues with Python’s pathlib, cf. https://stackoverflow.com/questions/57286486/i-cant-load-my-model-because-i-cant-put-a-posixpath.
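The same workaround can be wrapped in a context manager so that pathlib.PosixPath is restored even if loading a model raises an error. This is a sketch of an alternative way to organize the code above, not something provided by trouver or fast.ai; `posix_path_as_windows_path` is a hypothetical name.

```python
import pathlib
import platform
from contextlib import contextmanager


@contextmanager
def posix_path_as_windows_path():
    """On Windows, temporarily make pathlib.PosixPath an alias of WindowsPath."""
    if platform.system() != "Windows":
        yield  # nothing to patch on other systems
        return
    original = pathlib.PosixPath
    pathlib.PosixPath = pathlib.WindowsPath
    try:
        yield
    finally:
        pathlib.PosixPath = original  # restored even if loading raised an error


# Intended use (requires huggingface_hub and fastai, hence left as a comment):
# with posix_path_as_windows_path():
#     information_note_type_model = from_pretrained_fastai(repo_id)
```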

After running this code, we have the Python variables information_note_type_model, notation_identification_model, and summarizer. Note that summarizer is technically not a model, but rather a pipeline that contains a model.

To use the models, first run the following import statements:

import os
from pathlib import Path
import shutil
import tempfile

from trouver.helper import _test_directory, text_from_file
from trouver.latex.convert import (
    divide_preamble, divide_latex_text, custom_commands,
    setup_reference_from_latex_parts
)
from trouver.markdown.markdown.file import MarkdownFile
from trouver.markdown.obsidian.vault import VaultNote
from trouver.markdown.obsidian.personal.notes import notes_linked_in_notes_linked_in_note, notes_linked_in_note
from trouver.markdown.obsidian.personal.notation import make_notation_notes_from_double_asts, notation_notes_linked_in_see_also_section

Categorizing the type(s) of standard information notes

Run the following code to 1. use the information_note_type_model to predict the types of each standard information note in the newly created reference folder, and 2. record these predictions to the standard information notes’ respective frontmatter YAML metadata. Running the following code may take several minutes.

# Change `vault` and `reference` if necessary.
# vault = Path(r'C:\Users\hyunj\Documents\Development\Python\trouver_walkthrough_folder\example_math_vault')  # The path to the Obsidian vault
# reference = 'kim_park_ga1dcmmc'
index_note = VaultNote(vault, name=f'_index_{reference}')
notes = notes_linked_in_notes_linked_in_note(index_note, as_dict=False)

for note in notes:
    if not note.exists():
        raise Exception(note.name)

print("Tagging notes")
automatically_add_note_type_tags(information_note_type_model, vault, notes)

Let us first describe in more detail what the above code does. The first line

index_note = VaultNote(vault, name=f'_index_{reference}')

creates a VaultNote object from the trouver.markdown.obsidian.vault module, which represents a note in an Obsidian vault. The note/file that a VaultNote object represents does not have to exist, although runtime errors may be raised on certain operations, such as reading or writing a file, that require an existing file. In the above line of code, we pass vault as an argument, signifying that the VaultNote object should represent a note from the specified Obsidian vault (as opposed to a file in any directory outside of vault). We also pass the Python string f'_index_{reference}' as the argument to the name parameter — in this case the string equals the string '_index_kim_park_ga1dcmmc' — signifying that the VaultNote object should represent the note named _index_kim_park_ga1dcmmc (recall that trouver operates under the assumption that no two notes in an Obsidian vault have the same name).

Furthermore, the VaultNote class has a cache which is supposed to keep track of all the .md files in an Obsidian vault. Creating or deleting files by not using methods from the VaultNote class may make the information stored in this cache obsolete. The VaultNote class will try to re-scan files in the vault to update its cache, but sometimes (for reasons that are not yet well documented) the user may need to manually clear the cache via the VaultNote.clear_cache class method:

VaultNote.clear_cache()

Depending on the size of the vault and the specs of the computer running trouver, scanning the files in a vault may take several seconds or more.
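Since a VaultNote is looked up by name, the assumption that no two notes in a vault share a file name can be checked directly with the standard library. The sketch below does not use trouver; `duplicate_note_names` is a hypothetical helper. Note that the README.md files copied with the _references and _templates folders would be reported by it unless they have been deleted.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def duplicate_note_names(vault: Path) -> Dict[str, List[Path]]:
    """Map each .md file name occurring more than once to its paths relative to the vault."""
    by_name = defaultdict(list)
    for md_file in vault.rglob("*.md"):
        by_name[md_file.name].append(md_file.relative_to(vault))
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}
```

An empty dictionary means that no duplicate file names were found.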

The second line of code creates a list of VaultNote objects and stores the list in the variable notes:

notes = notes_linked_in_notes_linked_in_note(index_note, as_dict=False)

The notes_linked_in_notes_linked_in_note function is implemented specifically for index notes which keep track of index notes, which in turn keep track of standard information notes, via links; recall that the note _index_kim_park_ga1dcmmc is such an index note:

The note `_index_kim_park_ga1dcmmc` is an index note that itself keeps track of index notes
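To illustrate the idea of following links two levels down, here is a self-contained sketch that works on note texts stored in a dictionary rather than on files. It is not trouver's implementation: the regular expression only recognizes Obsidian wikilinks of the forms `[[name]]`, `[[name|text]]`, and `[[name#heading]]`, and all names below are made up for the example.

```python
import re
from typing import Dict, List

# Captures the note name at the start of a wikilink.
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def linked_names(text: str) -> List[str]:
    """Return the names of the notes linked in `text`, in order of appearance."""
    return [name.strip() for name in WIKILINK.findall(text)]


def names_linked_two_levels_down(index_name: str, texts: Dict[str, str]) -> List[str]:
    """Return the notes linked in the notes that are linked in the index note."""
    result = []
    for section_index in linked_names(texts[index_name]):
        result.extend(linked_names(texts.get(section_index, "")))
    return result


texts = {
    "_index_ref": "- [[_index_1_intro]]\n- [[_index_2_main|Main results]]",
    "_index_1_intro": "- [[ref_1]], first note\n- [[ref_2]]",
    "_index_2_main": "- [[ref_3#Topic]]",
}
print(names_linked_two_levels_down("_index_ref", texts))  # ['ref_1', 'ref_2', 'ref_3']
```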

The code

for note in notes:
    if not note.exists():
        raise Exception(note.name)

checks that the standard information notes listed in notes actually exist (more accurately, the code checks that the links in the notes linked in the note _index_kim_park_ga1dcmmc point to notes that actually exist). Finally,

automatically_add_note_type_tags(information_note_type_model, vault, notes)

uses the model to make predictions and record those predictions to the frontmatter YAML metadata of the notes.

The following is a comparison of a note before and after these predictions are made and recorded (note that the note is viewed in Edit mode as opposed to Viewing mode):

A note before the predictions are made

A note after the predictions are made

Before the predictions, the note only has the _reference/kim_park_ga1dcmmc and _meta/literature_note tags in the tags section of its YAML frontmatter metadata. After the predictions, the _auto/_meta/narrative tag is added.

The _auto prefix is used to signify that the tag is added automatically with the model. The _meta/narrative tag signifies that the content of the note contains a “narrative”; in this example, the note gives an overview of \(\mathbb{A}^1\)-enumerative geometry, so it is fitting to consider it a “narrative”. Some tags that the model will predict include

  • _meta/definition
  • _meta/notation
  • _meta/concept
  • _meta/proof
  • _meta/narrative
  • _meta/exercise
  • _meta/remark
  • _meta/example

See the documentation for trouver.markdown.obsidian.personal.machine_learning.information_note_types for more details.

We recommend manually editing these tags in your standard information notes as follows (if you have the time and inclination):

  • remove the _auto prefix to confirm the model’s prediction if the model makes a correct prediction.
  • delete the tag altogether if the model’s prediction is incorrect.
  • add tags that the model incorrectly did not predict.
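The three kinds of edits above amount to simple operations on a note's list of tags. The sketch below works on plain Python lists of tag strings and does not read or write note files; the function names are hypothetical and not part of trouver.

```python
from typing import List

AUTO_PREFIX = "_auto/"


def confirm_tag(tags: List[str], tag: str) -> List[str]:
    """Replace `_auto/<tag>` by `<tag>`, keeping the order of the other tags."""
    confirmed = [tag if t == AUTO_PREFIX + tag else t for t in tags]
    return list(dict.fromkeys(confirmed))  # drop a duplicate if `tag` was already present


def reject_tag(tags: List[str], tag: str) -> List[str]:
    """Delete the automatically added `_auto/<tag>` altogether."""
    return [t for t in tags if t != AUTO_PREFIX + tag]


def add_tag(tags: List[str], tag: str) -> List[str]:
    """Add a tag that the model did not predict, unless it is already present."""
    return tags if tag in tags else tags + [tag]


tags = ["_reference/kim_park_ga1dcmmc", "_meta/literature_note", "_auto/_meta/narrative"]
print(confirm_tag(tags, "_meta/narrative"))
```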

At the very least, we recommend making these edits for the _meta/definition and _meta/notation tags, which signify that a standard information note introduces at least one definition and at least one notation respectively.

For example, upon recognizing that the model correctly predicted the standard information note to contain a narrative, we delete the _auto prefix from the _auto/_meta/narrative tag:

A note after the _auto prefix is removed from the tag

Locating notations introduced in standard information notes

Next, we use the notation_identification_model to locate notations introduced in the standard information notes and record these locations in the notes. Run the following code, which may take several to a few dozen minutes:

index_note = VaultNote(vault, name=f'_index_{reference}')
notes = notes_linked_in_notes_linked_in_note(index_note, as_dict=False)

for note in notes:
    assert note.exists()

print("Finding notations")
note_mfs = [MarkdownFile.from_vault_note(note) for note in notes]
notation_notes = [note for note, mf in zip(notes, note_mfs) if mf.has_tag('_auto/_meta/definition') or mf.has_tag('_auto/_meta/notation') or mf.has_tag('_meta/definition') or mf.has_tag('_meta/notation')]
for note in notation_notes:
    automatically_mark_notations(note, notation_identification_model, reference_name=reference)

Again, notes is a list of all standard information notes in the reference folder. The line

note_mfs = [MarkdownFile.from_vault_note(note) for note in notes]

creates a list of MarkdownFile objects from the trouver.markdown.markdown.file module. A MarkdownFile object represents the contents of an Obsidian Markdown file (as opposed to a file itself). A MarkdownFile object can be created from a VaultNote object (via the MarkdownFile.from_vault_note factory method) or from a Python string (via the MarkdownFile.from_string factory method). In particular, note_mfs is a list of MarkdownFile objects created from the contents of notes.

The line

notation_notes = [note for note, mf in zip(notes, note_mfs) if mf.has_tag('_auto/_meta/definition') or mf.has_tag('_auto/_meta/notation') or mf.has_tag('_meta/definition') or mf.has_tag('_meta/notation')]

then creates notation_notes as a sublist of notes consisting only of the notes with at least one of the following tags in their frontmatter YAML metadata:

  • _auto/_meta/definition
  • _auto/_meta/notation
  • _meta/definition
  • _meta/notation
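The same filter can be written more compactly with `any`. In the sketch below, `FakeMarkdownFile` is a hypothetical stand-in that only mimics the `has_tag` method used above, so that the example runs without trouver or a vault; with real MarkdownFile objects only `has_any_tag` would be needed.

```python
NOTATION_RELATED_TAGS = (
    "_auto/_meta/definition", "_auto/_meta/notation",
    "_meta/definition", "_meta/notation",
)


def has_any_tag(mf, tags=NOTATION_RELATED_TAGS) -> bool:
    """Return True if `mf.has_tag` holds for at least one of `tags`."""
    return any(mf.has_tag(tag) for tag in tags)


class FakeMarkdownFile:
    """Hypothetical stand-in exposing only `has_tag`."""

    def __init__(self, tags):
        self.tags = set(tags)

    def has_tag(self, tag):
        return tag in self.tags


notes = ["note_1", "note_2", "note_3"]
note_mfs = [
    FakeMarkdownFile(["_meta/literature_note"]),
    FakeMarkdownFile(["_auto/_meta/notation"]),
    FakeMarkdownFile(["_meta/definition", "_meta/proof"]),
]
notation_notes = [note for note, mf in zip(notes, note_mfs) if has_any_tag(mf)]
print(notation_notes)  # ['note_2', 'note_3']
```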

The code

for note in notation_notes:
    automatically_mark_notations(note, notation_identification_model, reference_name=reference)

then iterates through notation_notes, using notation_identification_model to locate notations introduced in the notes in notation_notes only. We recommend restricting the model to the notes in notation_notes in this way because the current implementations of trouver and notation_identification_model locate notations rather slowly; the above code therefore searches for notations only in the notes that are deemed likely to contain definitions or notations.

The following is an example of a comparison of a note before and after notations are located (for the current version of trouver, the note named kim_park_ga1dcmmc_29 should suffice as an example):

A note before notations are identified

A note after notations are identified

Note that the difference is that LaTeX strings that the model deems to introduce a notation are surrounded by double asterisks **. To make these differences more conspicuous, we recommend that the user set Source mode as Obsidian’s Default editing mode in the Settings. For this example, the model has determined $F_q(X,Y)$ and $e(\pi, q)$ to introduce notations in the note.

We highly recommend the user to manually make the following edits to correct notation_identification_model’s mistakes:

  1. Remove the double asterisks ** surrounding LaTeX strings that do not actually introduce a notation
  2. Add double asterisks ** to surround LaTeX strings that the model has incorrectly not determined to introduce a notation.
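
When reviewing the model's output by hand, it can help to list exactly which LaTeX strings are currently marked. The following is a rough sketch for that purpose. marked_notations and unmark are hypothetical helpers (not part of trouver) that operate on the text of a note as a Python string, and the regular expression assumes that marked strings are inline $...$ math.

```python
import re

# Matches inline math wrapped in double asterisks, e.g. **$F_q(X,Y)$**.
# Assumption: marked strings are inline `$...$` math with no `$` inside.
MARKED_MATH = re.compile(r"\*\*(\$[^$]+\$)\*\*")


def marked_notations(text: str) -> list[str]:
    """Return the LaTeX strings currently wrapped in double asterisks."""
    return MARKED_MATH.findall(text)


def unmark(text: str, latex: str) -> str:
    """Remove the double asterisks around one specific LaTeX string."""
    return text.replace(f"**{latex}**", latex)
```

Printing marked_notations(text) for a note gives a short checklist to compare against the note itself; unmark covers the first kind of correction, while adding missing asterisks is still best done by eye in Obsidian.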

Warning The automatically_mark_notations function not only adds double asterisks ** to LaTeX math mode strings, but also removes components such as links and footnotes from the text of the note. It is recommended to only apply this function to notes whose text has not been embellished with such components (see footnote 1). Moreover, the automatically_mark_notations function is currently buggy and should not be applied to the same note twice.
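
One way to follow this recommendation is to check the text of a note for links and footnotes before passing the note to automatically_mark_notations. The following sketch is a heuristic under stated assumptions, not part of trouver: embellishments and is_safe_to_mark are hypothetical helpers, and the regular expressions only recognize Obsidian wikilinks, Markdown links, and Markdown footnote markers.

```python
import re

WIKILINK = re.compile(r"\[\[[^\]]+\]\]")            # [[note]] or [[note|text]]
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\([^)]+\)")  # [text](url)
FOOTNOTE = re.compile(r"\[\^[^\]]+\]")              # [^1] (marker or definition)


def embellishments(text: str) -> dict[str, int]:
    """Count the links and footnote markers found in the text of a note."""
    return {
        "wikilinks": len(WIKILINK.findall(text)),
        "markdown_links": len(MARKDOWN_LINK.findall(text)),
        "footnotes": len(FOOTNOTE.findall(text)),
    }


def is_safe_to_mark(text: str) -> bool:
    """True if none of the recognized components appear in the text."""
    return not any(embellishments(text).values())
```

The check is meant for the body of the note; what automatically_mark_notations removes beyond links and footnotes is not covered by this sketch.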

Creating notation notes for the notations introduced in standard information notes

After the double asterisks ** have been added, we use trouver to automatically create new notes dedicated to each notation and to link to these newly created notes from the notes in which the notations are introduced.

This part does not depend on any ML models. Run the following code:

index_note = VaultNote(vault, name=f'_index_{reference}')
notes = notes_linked_in_notes_linked_in_note(index_note, as_dict=False)

for note in notes:
    try:
        # Create one notation note per LaTeX string marked with ** in `note`.
        new_notes = make_notation_notes_from_double_asts(note, vault, reference_name=reference)
    except Exception:
        # Report which note failed, then re-raise with the original traceback.
        print(note.name)
        raise
    # assert len(new_notes) == 0

The following is a comparison of the note before and after the above code is executed:

The standard information note before the notation notes are created

The standard information note after the notation notes are created

Even if some extraneous notation notes are created because double asterisks ** were incorrectly placed around a LaTeX string, there is nothing to worry about: simply remove those double asterisks, delete the notation note, and delete the link to the notation note.

Moreover, even if some double asterisks ** had not been marked to indicate a notation introduced in a standard information note, running the above code again will safely create the notation note after the double asterisks are correctly added.

For example, notation_identification_model seems to have deemed the LaTeX strings $e(\pi, q)$ and $e(\pi,q)$ to introduce notations, and marked these LaTeX strings with double asterisks **. The make_notation_notes_from_double_asts function above then created notation notes corresponding to these LaTeX strings. However, the author of trouver does not consider these strings to have actually introduced notations (because the standard information note containing these strings does NOT define what these notations mean). Let us remove the double asterisks, delete the corresponding notation notes, and delete the links to these notation notes:

A note after notations are identified

Summarizing what the notations denote and recording the summaries in the notation notes

Finally, we use the summarizer pipeline to summarize what the notations denote and record these summaries in the notation notes:

index_note = VaultNote(vault, name=f'_index_{reference}')
notes = notes_linked_in_notes_linked_in_note(index_note, as_dict=False)

# Fail early if the index links to a note that does not exist in the vault.
for note in notes:
    if not note.exists():
        raise FileNotFoundError(f"note does not exist: {note.name}")

print("Summarizing notations")
for note in notes:
    notation_notes_linked_in_note = notation_notes_linked_in_see_also_section(note, vault)
    for notation_note in notation_notes_linked_in_note:
        append_summary_to_notation_note(notation_note, vault, summarizer)

The following is a comparison of the notation note before and after the append_summary_to_notation_note function is applied to the notation note:

The notation note before the summarization happens

The notation note after the summarization happens

Note that the _auto/notation_summary tag is added to the notation note’s YAML frontmatter metadata to indicate that the summarizer pipeline generated the summary.

We highly recommend manually editing the notation note as follows:

  • Correct the notation LaTeX string if necessary (in this example, the notation is the string $F_q(X,Y)$ before the link [[kimpark_ga1dcmmc_29|denotes]] and it looks like there is nothing to correct. However, corrections might be necessary more generally, e.g. )
  • Rename the notation notes to be more searchable. Recall that Obsidian’s Quick switcher provides suggestions for notes to open based on note/file names.
  • Correct the autogenerated summary and remove the _auto/notation_summary tag.
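
Because the _auto/notation_summary tag is removed only after a summary has been checked by hand, the notes that still carry it form a natural review queue. The following is a minimal sketch for listing them. It reads the Markdown files directly rather than going through trouver, notes_awaiting_review is a hypothetical helper, and the folder argument is assumed to be the directory containing the notation notes.

```python
from pathlib import Path

AUTO_SUMMARY_TAG = "_auto/notation_summary"


def notes_awaiting_review(folder: Path) -> list[str]:
    """Names of Markdown files under `folder` whose frontmatter still has the tag."""
    pending = []
    for path in sorted(folder.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        # Only look inside the leading `---` block, if there is one.
        if not text.startswith("---\n"):
            continue
        end = text.find("\n---", 3)
        frontmatter = text[4:end] if end != -1 else ""
        # Plain substring match: a deliberate simplification for this sketch.
        if AUTO_SUMMARY_TAG in frontmatter:
            pending.append(path.stem)
    return pending
```

Restricting the search to the frontmatter avoids false positives from notes that merely mention the tag in their body.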

Warning The model is currently not robust enough to generate reliable summaries in general. The model especially struggles with generating LaTeX strings.

For example, this is what the notation note for the string `$F_q(X,Y)$` looks like just after the autogenerated summary has been added:

The notation note after the summarization happens in viewing mode

The notation note after the summarization happens in editing mode

Let us fix the summary, remove the _auto/notation_summary tag (to indicate that the summary in the notation note is no longer autogenerated), and rename the note as follows:

Reading view for notation note summary fixed

Edit view for notation note summary fixed

  • If the “notation note” does not correspond to a LaTeX string in a standard information note, then delete the “notation note”.

Applications of the notation notes

We also recommend using notation notes in the following ways:

Embed notation notes in footnotes

It can be useful to (manually) embed notation notes in footnotes of standard information notes. For example, the following is a (standard information) note presenting a lemma about \(F_q(X,Y)\):

A note with a lemma using the notation $F_q(X,Y)$

Let us add a footnote with the notation note for the notation \(F_q(X,Y)\) embedded:

A note with a lemma using the notation $F_q(X,Y)$

Adding a footnote that embeds the notation note for $F_q(X,Y)$

Let us now view the standard information note in Viewing mode:

A note viewed in Viewing mode with a footnote that embeds the notation note for $F_q(X,Y)$

The following is what the bottom of the note looks like. Note that the notation note has been embedded:

A note viewed in Viewing mode with a footnote that embeds the notation note for $F_q(X,Y)$

In particular, the footnote indicates to the reader/user that the LaTeX string $F_q(s_0,s_1)$ depends upon the notation \(F_q(X,Y)\), which denotes the monic minimal polynomial of the closed point \(q \in \mathbb{P}^1_k\). Such footnotes can be useful for users who need to remind themselves of mathematical facts, but have difficulty recalling precisely what various notations denote.
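
The footnote above was added by hand, but its shape is simple enough to sketch in code. The following hypothetical helper (embed_in_footnote is not a trouver function, and the note name in the usage lines is made up) appends a footnote marker after the first occurrence of a LaTeX string and adds a footnote definition whose body is an Obsidian embed ![[...]].

```python
def embed_in_footnote(
    text: str, anchor: str, notation_note_name: str, label: str = "1"
) -> str:
    """Add a footnote after the first occurrence of `anchor` that embeds a note.

    `[^label]` is Markdown footnote syntax and `![[name]]` is Obsidian's
    embed syntax; the function itself is a hypothetical helper.
    """
    if anchor not in text:
        raise ValueError(f"{anchor!r} not found in the note text")
    marked = text.replace(anchor, f"{anchor}[^{label}]", 1)
    return f"{marked.rstrip()}\n\n[^{label}]: ![[{notation_note_name}]]\n"


# Usage with a made-up note name:
note_text = "Lemma. Consider $F_q(s_0,s_1)$.\n"
print(embed_in_footnote(note_text, "$F_q(s_0,s_1)$", "hypothetical_notation_note_name"))
```

Since automatically_mark_notations removes footnotes, a helper like this would only make sense for notes whose notations have already been marked and reviewed.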

Footnotes

  1. More precisely, automatically_mark_notations first applies process_standard_information_note to a MarkdownFile object constructed from the VaultNote object to roughly obtain the “raw text” of the note, uses that raw text to locate notations, marks the notations in the raw text, and then replaces the text from the note with the raw text with notations marked. In the process of obtaining the “raw text”, the process_standard_information_note function removes components such as links and footnotes from the text.