markdown.obsidian.personal.machine_learning.notation_linking

Functions for gathering and processing training data, and for using ML models to link notation notes with one another.

In a trouver-styled Obsidian.md vault, notation notes summarize the definitions of various notations introduced in excerpts of mathematical text. They are also written to quickly indicate where each notation is introduced.

Stating the definition of a notation often depends on other notations, usually defined before the notation itself. In practice, a notation note lists links to other notation notes that it depends on. Some of the links are made within the “content” of the notation note. Some others are made in a bulleted list of links.

The following is an example of a notation note:

TODO: insert example

This module contains functions to train and use models to predict whether one notation note depends on another.

Gather ML data from notation notes

Gathering data is done for each “reference folder”, which is a group of notes belonging to a single mathematical text. Each data point consists of

  1. An ordered pair of notation notes,
  2. Some miscellaneous but precisely quantifiable information about the relationship between the notation notes, and
  3. Whether or not the first notation note depends on the second (True/False).

Each data point should be processed in such a way that training/predictions are

When gathering such data, only a subset of the False data points is collected. This prevents bias against True data points, which are relatively rare.
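
The idea of keeping every True pair while collecting only part of the False pairs can be sketched as follows. This is a minimal illustration and not the module's actual sampling code: the function `sample_pairs`, its parameter `negatives_per_positive`, and the note names are all hypothetical.

```python
import random
from itertools import permutations

def sample_pairs(note_names, positive_pairs, negatives_per_positive=2, seed=0):
    """Keep every True pair and only a random subset of the False pairs."""
    positives = set(positive_pairs)
    # Every ordered pair of distinct notes that is not a known dependency.
    negatives = [p for p in permutations(note_names, 2) if p not in positives]
    rng = random.Random(seed)
    # Cap the number of False pairs relative to the number of True pairs.
    k = min(len(negatives), negatives_per_positive * len(positives))
    sampled = rng.sample(negatives, k)
    labelled = [(a, b, True) for a, b in sorted(positives)]
    labelled += [(a, b, False) for a, b in sampled]
    return labelled

# Hypothetical note names and dependencies, for illustration only.
notes = ["note_A", "note_B", "note_C", "note_D"]
links = [("note_B", "note_A"), ("note_D", "note_C")]
pairs = sample_pairs(notes, links)
```

Fixing the seed keeps the sampled False pairs reproducible between runs, which can help when comparing models trained on the same reference.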



data_from_notation_notes

 data_from_notation_notes
                           (origin_notation_note:trouver.markdown.obsidian.vault.VaultNote,
                           relied_notation_note:trouver.markdown.obsidian.vault.VaultNote,
                           include_origin_content:bool,
                           include_relied_content:bool,
                           origin_parsed:Optional[tuple]=None,
                           relied_parsed:Optional[tuple]=None,
                           reference_name:Optional[str]=None,
                           main_of_origin_content:Optional[str]=None,
                           main_of_relied_content:Optional[str]=None,
                           information_notes_of_reference:Optional[list[trouver.markdown.obsidian.vault.VaultNote]]=None)

Obtain data for a single pair of notation notes.

Assumes that

  • origin_notation_note and relied_notation_note have the same vault attribute.
  • origin_parsed and relied_parsed, if specified, are the outputs of parse_notation_note applied to origin_notation_note and relied_notation_note respectively.
  • reference_name is the correct output of reference_of_information_note applied to main_of_origin, and this output is the same as the one obtained from main_of_relied.
  • main_of_origin_content and main_of_relied_content, if specified, are the outputs of process_standard_information_note(MarkdownFile.from_vault_note(main_of_origin)) and process_standard_information_note(MarkdownFile.from_vault_note(main_of_relied)) respectively.
  • information_notes_of_reference correctly lists the standard information notes of the reference named reference_name in the vault.

Returns - tuple[str, str, str, str, str, str, str, str, str, str, str, str, str, bool] - Consists of the following:

  1. The name of the reference from which the notation notes and main information notes come,
  2. The name of origin_notation_note,
  3. The name of relied_notation_note,
  4. The name of the main information note of origin_notation_note,
  5. The name of the main information note of relied_notation_note,
  6. The (processed) content of main_of_origin_content,
  7. The (processed) content of main_of_relied_content,
  8. The content of origin_notation_note,
  9. The content of relied_notation_note,
  10. The first entry in the latex_in_original field in the YAML frontmatter metadata or, if unavailable, the notation that is summarized in origin_notation_note,
  11. The first entry in the latex_in_original field in the YAML frontmatter metadata or, if unavailable, the notation that is summarized in relied_notation_note,
  12. The notation that is summarized in origin_notation_note,
  13. The notation that is summarized in relied_notation_note,
  14. True if origin_notation_note links to relied_notation_note, False otherwise.

Type Default Details
origin_notation_note VaultNote The notation note which potentially uses the notation introduced by relied_notation_note. In particular, there potentially ought to be a link to relied_notation_note in origin_notation_note.
relied_notation_note VaultNote See the description for origin_notation_note.
include_origin_content bool If True, include the content of origin_notation_note, i.e. a summary of what the notation introduced by this note means.
include_relied_content bool If True, include the content of relied_notation_note, i.e. a summary of what the notation introduced by this note means.
origin_parsed Optional None The output of parse_notation_note applied to origin_notation_note
relied_parsed Optional None The output of parse_notation_note applied to relied_notation_note
reference_name Optional None The name of the reference folder in the vault from which the two notation notes come. If None, this is computed “on-the-fly” based on the reference of main_of_origin; see reference_of_information_note.
main_of_origin_content Optional None The content of main_of_origin, i.e. the output of process_standard_information_note(MarkdownFile.from_vault_note(main_of_origin)). If None, this is computed “on-the-fly”.
main_of_relied_content Optional None The content of main_of_relied, i.e. the output of process_standard_information_note(MarkdownFile.from_vault_note(main_of_relied)). If None, this is computed “on-the-fly”.
information_notes_of_reference Optional None The standard information notes for the reference folder in order (as arranged in the index notes of the reference folder)
Returns tuple
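
Since the returned tuple has 14 positional entries, it can be easier to read once the entries are named. The following is a minimal sketch under that assumption: the class `NotationLinkDataPoint` and its field names are hypothetical labels for the entries listed above and are not part of trouver, and the values in `raw` are placeholders.

```python
from typing import NamedTuple

class NotationLinkDataPoint(NamedTuple):
    """Hypothetical named view of the 14-entry tuple described above."""
    reference_name: str
    origin_note_name: str
    relied_note_name: str
    main_of_origin_name: str
    main_of_relied_name: str
    main_of_origin_content: str
    main_of_relied_content: str
    origin_content: str
    relied_content: str
    origin_latex_in_original: str
    relied_latex_in_original: str
    origin_notation: str
    relied_notation: str
    links: bool

# Placeholder values standing in for an output of data_from_notation_notes.
raw = (
    "example_reference", "origin_note", "relied_note", "main_origin", "main_relied",
    "main origin text", "main relied text", "origin summary", "relied summary",
    "$f_*$", "$X$", "$f_*$", "$X$", True,
)
point = NotationLinkDataPoint(*raw)
```

With this wrapper, `point.links` reads the label and `point.relied_note_name` reads the third entry, while the object still behaves as a plain tuple.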


data_points_for_reference

 data_points_for_reference
                            (reference_index_note:trouver.markdown.obsidian.vault.VaultNote,
                            return_notation_note_parsings:bool=False)

Compile data points for notation note linking based on the information notes and notation notes in a reference folder.

“Positive” linking data points are relatively rare in comparison to “Negative” data points, so “Negative” data points are randomly sampled (although the random samples will redundantly include “Positive” data as well).

Note that it makes sense to draw data exclusively within each “reference”, since notations tend to have dependencies within the same mathematical text.

Returns - Union[list[tuple], tuple[list[tuple], dict[str, tuple]]] - Either

  1. a list of tuples, in which case each tuple is a “data point” and is an output of data_from_notation_notes, or
  2. the list of tuples along with a dict whose keys are the names of the notation notes and whose values are the outputs of parse_notation_note applied to these notation notes.

Type Default Details
reference_index_note VaultNote The index note for the reference from which to draw the data.
return_notation_note_parsings bool False If True, return the outputs of parse_notation_note applied to the notation notes in the reference folder
Returns Union
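
Because the return type depends on return_notation_note_parsings, calling code may want to normalize the two shapes. The helper below is a minimal sketch of one way to do that; `split_result` is a hypothetical name and the sample values are placeholders, not real outputs.

```python
from typing import Union

def split_result(result: Union[list, tuple]) -> tuple[list, dict]:
    """Return (data_points, parsings) for either documented return shape."""
    if isinstance(result, tuple):
        # Shape 2: the list of data points together with the dict of parsings.
        data_points, parsings = result
        return data_points, parsings
    # Shape 1: only the list of data points; no parsings were requested.
    return result, {}

# Placeholder values imitating the two documented return shapes.
only_points = [("placeholder data point",)]
with_parsings = (only_points, {"some_notation_note": ("placeholder parsing",)})

points_a, parsings_a = split_result(only_points)
points_b, parsings_b = split_result(with_parsings)
```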


text_from_data_point

 text_from_data_point (data_point:tuple)

Format a data point to present it as a str.

Type Details
data_point tuple An output of data_from_notation_notes.
Returns str

Use the trained model



prediction_by_model_via_datapoint

 prediction_by_model_via_datapoint (learn, data_point:tuple)
Type Details
learn The fastai textlearner which makes the prediction.
data_point tuple An output of data_from_notation_notes
Returns bool
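
As a rough illustration of how a boolean can be derived from a text classifier, the sketch below uses a stand-in object instead of a real learner. It relies on fastai's `Learner.predict` returning a 3-tuple of decoded label, class index, and class probabilities. It further assumes, without confirmation from this module, that class index 1 corresponds to True; `StubLearner`, `predict_link`, and `threshold` are hypothetical names.

```python
class StubLearner:
    """Stand-in that mimics the 3-tuple shape returned by fastai's Learner.predict."""

    def predict(self, text):
        # Toy rule, for illustration only: confident when the text says "depends".
        prob_true = 0.9 if "depends" in text else 0.2
        label = "True" if prob_true >= 0.5 else "False"
        return label, int(prob_true >= 0.5), [1 - prob_true, prob_true]

def predict_link(learn, text, threshold=0.5):
    """Return True when the probability of the assumed True class reaches the threshold."""
    _label, _index, probs = learn.predict(text)
    # Assumption: index 1 is the True class; check the learner's vocabulary in practice.
    return float(probs[1]) >= threshold

linked = predict_link(StubLearner(), "note A depends on note B")
not_linked = predict_link(StubLearner(), "unrelated notes")
```

Thresholding the probability rather than reading the decoded label makes it possible to trade precision against recall when suggesting links.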


prediction_by_model

 prediction_by_model
                      (origin_notation_note:trouver.markdown.obsidian.vault.VaultNote,
                      relied_notation_note:trouver.markdown.obsidian.vault.VaultNote,
                      learn,
                      include_origin_content:bool,
                      include_relied_content:bool,
                      origin_parsed:Optional[tuple]=None,
                      relied_parsed:Optional[tuple]=None,
                      reference_name:Optional[str]=None,
                      main_of_origin_content:Optional[str]=None,
                      main_of_relied_content:Optional[str]=None,
                      information_notes_of_reference:Optional[list[trouver.markdown.obsidian.vault.VaultNote]]=None)

Predict whether a notation note depends on the notation summarized by another notation note.

See data_from_notation_notes for details on all parameters except learn.

See also prediction_by_model_via_datapoint for an alternative function for predictions.

Type Default Details
origin_notation_note VaultNote
relied_notation_note VaultNote
learn The fastai textlearner which makes the prediction.
include_origin_content bool If True, include the content of origin_notation_note, i.e. a summary of what the notation introduced by this note means.
include_relied_content bool If True, include the content of relied_notation_note, i.e. a summary of what the notation introduced by this note means.
origin_parsed Optional None The output of parse_notation_note applied to origin_notation_note
relied_parsed Optional None The output of parse_notation_note applied to relied_notation_note
reference_name Optional None The name of the reference folder in the vault from which the two notation notes come. If None, this is computed “on-the-fly” based on the reference of main_of_origin; see reference_of_information_note.
main_of_origin_content Optional None The content of main_of_origin, i.e. the output of process_standard_information_note(MarkdownFile.from_vault_note(main_of_origin)). If None, this is computed “on-the-fly”.
main_of_relied_content Optional None The content of main_of_relied, i.e. the output of process_standard_information_note(MarkdownFile.from_vault_note(main_of_relied)). If None, this is computed “on-the-fly”.
information_notes_of_reference Optional None The standard information notes for the reference folder in order (as arranged in the index notes of the reference folder)
Returns bool True if the model predicts that origin_notation_note depends on the notation summarized by relied_notation_note and hence should link to relied_notation_note.
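
The optional parameters such as origin_parsed and relied_parsed accept precomputed values, which suggests computing them once when many pairs from the same reference are scored. The sketch below illustrates that pattern with stand-ins only: `parse_once`, `predict_all_pairs`, `fake_parse`, and `fake_predict` are hypothetical, and `fake_predict` merely plays the role of a call to prediction_by_model.

```python
from itertools import permutations

def parse_once(notes, parse):
    """Parse every note a single time and index the results by note name."""
    return {name: parse(name) for name in notes}

def predict_all_pairs(notes, parse, predict):
    """Score every ordered pair of distinct notes, reusing the cached parsings."""
    parsed = parse_once(notes, parse)
    return {
        (origin, relied): predict(
            origin, relied,
            origin_parsed=parsed[origin], relied_parsed=parsed[relied],
        )
        for origin, relied in permutations(notes, 2)
    }

calls = []

def fake_parse(name):
    # Stand-in for parse_notation_note; records how often it is called.
    calls.append(name)
    return (name.upper(),)

def fake_predict(origin, relied, origin_parsed, relied_parsed):
    # Stand-in for prediction_by_model; a toy ordering rule, not a real model.
    return origin_parsed[0] > relied_parsed[0]

results = predict_all_pairs(["a", "b", "c"], fake_parse, fake_predict)
```

Here each note is parsed once even though it takes part in several pairs, which is the saving the precomputed parameters are meant to allow.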
