markdown.obsidian.personal.machine_learning.definition_and_notation_naming

Functions for gathering and processing training data, and for using ML models to “name” definitions and notations

trouver.markdown.obsidian.personal.machine_learning.tokenize.def_and_notat_token_classification has functions for gathering and processing training data, and for using ML models to identify definitions and notations introduced in notes via token classification. Identified definitions and notations are marked using HTML tags. It would be convenient to also predict the “names” of these definitions and notations.

TODO: insert examples of definitions and notations with HTML tags and examples of what the “names” of these definitions and notation should be

Gather ML data from information notes


data_from_information_note

 data_from_information_note (info_note:trouver.markdown.obsidian.vault.VaultNote)

Obtain data for naming definitions and notations for a standard information note.

Definitions and notations should be marked by HTML tags (see markdown.obsidian.personal.machine_learning.tokenize.def_and_notat_token_classification).

- A definition is to be marked by an HTML tag with a definition attribute, which is the definition’s “name”, i.e. words and/or phrases describing what the definition is called and to what objects/situations the definition is applicable. If multiple combinations of words/phrases are appropriate, then they are separated by a single semicolon ;. If the definition attribute is "", then the definition name has not been marked, either manually or automatically.
- A notation (technically, the full LaTeX string in which the notation is introduced) is to be marked by an HTML tag with a notation attribute, which is the notation’s “name”, i.e. the actual notation introduced in the LaTeX string (without surrounding dollar signs ($ or $$)). If multiple notations are appropriate, then they are separated by double semicolons ;;. If the notation attribute is "", then either the notation has not been marked, or the LaTeX string (minus the surrounding dollar signs) is exactly the introduced notation.

Returns

- list[dict[str, str]]: Each dict corresponds to a single datapoint, which holds the data of the naming of a single definition or notation (LaTeX str) introduced in info_note. The keys are 'text' and either 'definition' or 'notation'. The 'text' entry should be the processed text of info_note; see process_standard_information_note.

| | Type | Details |
|---|---|---|
| info_note | VaultNote | The standard information note from which to draw data. |
| Returns | list | Each dict corresponds to a single datapoint, which holds the data of the naming of a single definition or notation (LaTeX str) introduced in info_note. |
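The attribute conventions described above can be illustrated with a small standalone sketch. It does not call trouver and does not reproduce process_standard_information_note; the helper names (`toy_datapoints`, `split_names`) and the sample tags (`<b>` for a definition, `<span>` for a notation) are assumptions made for illustration only. It collects every `definition` or `notation` attribute from a piece of text and splits the raw value on `;` or `;;` respectively.

```python
from html.parser import HTMLParser


class _NameAttrParser(HTMLParser):
    """Collect `definition` / `notation` attribute values from any HTML tag."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (kind, raw attribute value) in document order

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("definition", "notation"):
                # A bare attribute with no value is treated like "".
                self.found.append((key, value or ""))


def split_names(kind, raw):
    """Split a raw attribute value into individual names.

    Definitions are separated by ';' and notations by ';;'.
    An empty attribute yields an empty list.
    """
    sep = ";;" if kind == "notation" else ";"
    return [part.strip() for part in raw.split(sep) if part.strip()]


def toy_datapoints(text):
    """Hypothetical stand-in: one dict per marked definition or notation.

    Each dict has a 'text' key and either a 'definition' or a 'notation' key.
    The real function is expected to store processed note text under 'text';
    here the input is stored unchanged.
    """
    parser = _NameAttrParser()
    parser.feed(text)
    return [{"text": text, kind: raw} for kind, raw in parser.found]


if __name__ == "__main__":
    sample = (
        r'A <b definition="ring; commutative ring">ring</b> is written '
        r'<span notation="R;;R^\times">$R, R^\times$</span>.'
    )
    for point in toy_datapoints(sample):
        kind = "definition" if "definition" in point else "notation"
        print(kind, split_names(kind, point[kind]))
```

Using different separators for the two kinds matters because a single semicolon can plausibly occur inside a LaTeX notation string, whereas `;;` is much less likely to.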

Use the ML model


predict_names

 predict_names (info_note:trouver.markdown.obsidian.vault.VaultNote,
                def_and_notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline],
                def_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline],
                notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline])

Predict the names of the definitions and notations using the trained ML models.

Either def_and_notat_pipeline or both def_pipeline and notat_pipeline should be provided.

| | Type | Details |
|---|---|---|
| info_note | VaultNote | |
| def_and_notat_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of both definitions and notations. |
| def_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of definitions. |
| notat_pipeline | Optional | A pipeline wrapping an ML model which predicts the naming of notations. |
| Returns | list | |
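The requirement that either def_and_notat_pipeline or both def_pipeline and notat_pipeline be provided can be sketched as a small argument check. This is an illustrative guess at the dispatch rule, not trouver's actual implementation; `select_pipelines` is a hypothetical helper, and the precedence given to the combined pipeline when all three are passed is an assumption.

```python
def select_pipelines(def_and_notat_pipeline=None,
                     def_pipeline=None,
                     notat_pipeline=None):
    """Return a (definition pipeline, notation pipeline) pair.

    Hypothetical helper: if the combined pipeline is given, it is used for
    both kinds of names; otherwise both specialized pipelines are required.
    """
    if def_and_notat_pipeline is not None:
        # Assumption: the combined pipeline takes precedence when present.
        return def_and_notat_pipeline, def_and_notat_pipeline
    if def_pipeline is not None and notat_pipeline is not None:
        return def_pipeline, notat_pipeline
    raise ValueError(
        "Provide either def_and_notat_pipeline, or both def_pipeline "
        "and notat_pipeline."
    )
```

Failing early with a ValueError keeps a half-specified call (for example, only def_pipeline) from silently skipping notations.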

add_names_to_html_tags_in_info_note

 add_names_to_html_tags_in_info_note (info_note:trouver.markdown.obsidian.vault.VaultNote,
                                      def_and_notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline]=None,
                                      def_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline]=None,
                                      notat_pipeline:Optional[transformers.pipelines.text2text_generation.SummarizationPipeline]=None,
                                      overwrite:bool=False,
                                      fix_formatting:bool=True,
                                      correct_syntax:bool=True)

Predict the names of definitions and notations marked with HTML tags within info_note and write those names in the "definition" or "notation" attributes in each tag.

Either def_and_notat_pipeline or both def_pipeline and notat_pipeline should be provided.

An #_auto/notation_notes_linked tag is added to origin_notation_note if such a tag is not already present.

| | Type | Default | Details |
|---|---|---|---|
| info_note | VaultNote | | |
| def_and_notat_pipeline | Optional | None | A pipeline wrapping an ML model which predicts the naming of both definitions and notations. |
| def_pipeline | Optional | None | A pipeline wrapping an ML model which predicts the naming of definitions. |
| notat_pipeline | Optional | None | A pipeline wrapping an ML model which predicts the naming of notations. |
| overwrite | bool | False | If True, overwrite pre-existing, nonempty attributes. If False, ignore pre-existing, nonempty attributes and only write on attributes that are empty. |
| fix_formatting | bool | True | If True, fix the formatting for notation names. |
| correct_syntax | bool | True | If True, attempt to fix syntax errors for notation names. |
| Returns | None | | |
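The overwrite rule can be stated as a tiny pure function. This is a sketch of the documented behavior only, under the assumption that an unmarked attribute is the empty string; `resolve_attribute` is a hypothetical name and is not part of trouver.

```python
def resolve_attribute(existing: str, predicted: str, overwrite: bool = False) -> str:
    """Decide which value ends up in a `definition` or `notation` attribute.

    - An empty attribute is always filled with the prediction.
    - A nonempty attribute is replaced only when overwrite is True.
    """
    if existing == "" or overwrite:
        return predicted
    return existing
```

With the default overwrite=False, names that were written by hand survive a later automatic pass, which is presumably why it is the default.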

Naming notation notes

It is also convenient to name notation notes automatically.


add_autogen_name_to_notation_note

 add_autogen_name_to_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote,
                                    autogen_name:str)

sanitize_autogen_name

 sanitize_autogen_name (autogen_name)

autogen_name_from_notation_note

 autogen_name_from_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote,
                                  pipeline)

predict_name_and_add_to_notation_note

 predict_name_and_add_to_notation_note (notation_note:trouver.markdown.obsidian.vault.VaultNote,
                                        notation_note_naming_pipeline:transformers.pipelines.text2text_generation.SummarizationPipeline,
                                        reference:str)

Predict an appropriate name for the notation note and add it in the YAML frontmatter metadata.